Underlay Cognitive Radios with Capacity 
Guarantees for Primary Users 

Antonio G. Marques 
Abstract 

To use the spectrum efficiently, cognitive radios leverage knowledge of the channel state information 
(CSI) to optimize the performance of the secondary users (SUs) while limiting the interference to the primary 
users (PUs). The algorithms in this paper are designed to maximize the weighted ergodic sum-capacity of 
SUs, which transmit orthogonally and adhere simultaneously to constraints limiting: i) the long-term (ergodic) 
capacity loss caused to each PU receiver; ii) the long-term interference power at each PU receiver; and iii) the 
long-term power at each SU transmitter. Formulations accounting for short-term counterparts of i) and ii) are 
also discussed. Although the long-term capacity constraints are non-convex, the resultant optimization problem 
exhibits zero-duality gap and can be efficiently solved in the dual domain. The optimal allocation schemes 
(power and rate loadings, frequency bands to be accessed, and SU links to be activated) are a function of the 
CSI of the primary and secondary networks as well as the Lagrange multipliers associated with the long-term 
constraints. The optimal resource allocation algorithms are first designed under the assumption that the CSI 
is perfect, then the modifications needed to accommodate different forms of imperfect CSI (quantized, noisy, 
and outdated) are analyzed. 

Index Terms 

Cognitive radios, resource management, stochastic approximation, imperfect channel state information. 

I. Introduction 

Cognitive radios (CRs) implementing dynamic spectrum access (DSA) schemes are the next generation 
solution for the problem of deploying new wireless services in an overcrowded radio environment [12], [10]. 
CR users, typically referred to as secondary users (SUs), have to sense the radio spectrum and use the sensing 
measurements to adapt dynamically the configuration of the CR. Such tasks have to be carried out with the 
aim of optimizing the quality of service (QoS) of the SUs while limiting the interference to the receivers 
which hold the licence of the frequency band, referred to as primary users (PUs). The specific rules that 
establish how SUs and PUs coexist and how the interference is limited depend on the so-called CR paradigm 
considered (underlay, overlay, or interweave [10]) and the DSA policy implemented [31]. 

The merits of adaptive schemes for traditional wireless systems that first acquire knowledge of the channel 
state information (CSI) and then use the CSI to optimally allocate the transmit resources are well documented; 
see [9]. However, for channel-adaptive schemes to be deployed in CR scenarios [20], [26], [27], [13], important 
challenges not present in traditional wireless networks arise. Next we describe several of them. 

Challenge 1: Sensing the CR spectrum and acquiring the corresponding CSI (especially that of the 
primary network) is a difficult task. The CSI in CRs is heterogeneous (presence of PUs, SU-to-PU channels, 
SU-to-SU channels, PU-to-PU channels) and inherently distributed. Some PUs may be located far away and 
be unwilling to collaborate with the SUs. The CSI may also vary fast and, due to interference, might not 
be stationary. Furthermore, to become aware of the overall radio environment, not only channels but also 
additional (network) side information may need to be sensed/estimated [10]. As a result, the CSI in CRs 
has higher dimensionality and heterogeneous quality (information of SU-to-SU links is typically better than 
that of SU-to-PU). Hence, advanced signal processing schemes that keep track of the CSI and mitigate the 
existing uncertainties have to be implemented. To deal with these problems, most CR works consider that the 
CSI contains some type of imperfections. Such imperfections are typically modeled as either noisy CSI (the 
actual CSI is corrupted with additive noise [20]) or quantized CSI (only a coarse description of the 
CSI is available [19], [15]). Fewer works have considered the fact that the CSI may be not only noisy but 
also outdated [4], [17]; have developed signal processing schemes to mitigate the CSI uncertainties; or have 
incorporated those imperfections into the design of resource allocation (RA) algorithms [20], [25], [1], [5]. In 
this paper we take a general approach to model the CSI imperfections and consider that the distribution of the 
instantaneous CSI (referred to as belief) is available. This will allow us to: i) consider simultaneously different 
sources of CSI imperfections; and ii) address the design of systems with a broad degree of CSI uncertainties 
(from almost perfect CSI to severely degraded CSI). The expression for the belief and the rules to update it 
will depend on the operating conditions of the system. For example, if the CSI is perfect, the belief coincides 
with the instantaneous channel measurements. On the other hand, if only statistical CSI is available, the belief 
coincides with the long-term distribution of the channel and does not vary with time. 

Challenge 2: As already mentioned, CR transmissions must obey additional rules that establish how SUs 
and PUs coexist and how to control interference. Such rules are typically formulated as constraints and depend 
on the specific CR paradigm and the DSA policies implemented. Overlay CRs (referred to as interweave CRs 
in [10]) allow SUs to transmit only if PUs are not active. Differently, underlay CRs allow for SU transmissions 
provided that the damage (interference) to the PUs is not too high. To keep the interference low, some works 
limit the interference power at the primary receiver side, either by imposing instantaneous (short-term) or 
average (long-term) interference power constraints; see, e.g., [13], [30], [29], [11]. The latter are better suited 
for fading channels because they can exploit the diversity of the interfering link [30], [11]. Other works 
guarantee a minimum signal-to-interference-plus-noise ratio (SINR) at the PU receiver [14], [8]. Short-term 
SINR constraints can be easily translated to (short-term) interference power constraints, while long-term SINR 
constraints cannot. More recent designs use a probabilistic approach to limit the probability of interfering with the 
primary transmissions [26], [27], [2], [17]. Other works have designed schemes either guaranteeing a minimum 
capacity (rate) for the PU or limiting the capacity loss at the PU receiver [8], [19]. Providing guarantees on 
the capacity of the PU links is typically a non-convex problem, so that most works have developed suboptimal 
solutions and focused on short-term formulations, which are more tractable and in some cases can be rendered 
convex [8]. In this paper we consider that PUs are not always active. When the channels are not occupied, the 
SUs are allowed to transmit (overlay paradigm). When the PUs are active, the SU transmissions adhere to 
diverse DSA constraints (short and long term interference power and rate loss) that guarantee that the damage 
to PUs is kept under control (underlay paradigm). 

Challenge 3: CRs have to use the time-varying (imperfect) CSI to dynamically adapt the available resources 
(power and rate loadings of the SUs) and decide the frequency bands to be used and the specific SUs that will 
use them. Relative to the RA in traditional wireless systems, the problem in CRs is challenging not only because 
more variables are involved, but also because the description of the CSI is more complicated and the schemes 
have to satisfy the additional DSA constraints. Different approaches have been used to formulate and solve the 
RA problem: game theory [21], non-linear optimization [29], convex approximation [5], dynamic programming 
[4], adaptive control [26] and even bio-inspired models [6]. In this paper, we design the RA schemes using 
non-linear optimization and dual stochastic approximation tools. The stochastic schemes are robust to channel 
non-stationarities and require less computational burden than that of the (non-stochastic) allocation schemes. 
Moreover, they are well suited for dealing with CSI imperfections. Dual stochastic algorithms have been 
successfully used to allocate resources in wireless networks, see, e.g., [23], [18] and [19], [27] for examples 
in the context of CRs. 

Motivated by these challenges, we design RA algorithms that optimize the rate performance of the SUs and 
limit the interference to the PUs. We focus on CRs where SUs adapt their power and rate loadings dynamically, 
and access orthogonally a set of frequency bands which are primarily devoted to PU transmissions. Orthogonal 
here means that if an SU is transmitting, no other SU can be active in the same band. The RA schemes are 
then obtained as the solution of a weighted sum-average capacity maximization subject to four types of 
constraints: i) limits on the long-term (ergodic) capacity loss inflicted on each PU; ii) limits on the long-term 
interference power at each PU [11]; iii) limits on the long-term power transmitted by each SU; and iv) short-term 
formulations of i) and ii). Consideration of i) is challenging because the interfering (SU) powers render 
the capacity term non-convex, and it is the main contribution of this work. Although the formulated problem is 
non-convex, it has zero duality gap. As a result, the Lagrangian relaxation is optimal. Additionally, 
the operating conditions of the secondary network (and the formulation of the objective to be optimized) are such 
that the problem in the dual domain can be separated across users and frequency bands. This favorable structure 
allows for a significant reduction in the complexity required to find the optimal solution and, hence, renders 
the non-convex problem computationally tractable. Different forms of channel imperfections are considered 
(quantized, noisy, outdated, statistical). The optimal RA schemes are complemented with simple but effective 
stochastic signal processing algorithms both to mitigate the effects of the CSI imperfections, and to estimate 
online the value of the multipliers required to implement the optimal RA. Such stochastic algorithms are able 
to track the time-variation of the environment and/or learn unknown parameters on-the-fly, features that are 
especially attractive for CR systems [12], [19]. 

The rest of the paper is organized as follows. Sec. II presents the model for the (perfect) CSI, describes the 
operating conditions of the secondary network, and formulates the DSA constraints that SUs must obey. Sec. 
III deals with the design of the optimal RA algorithms. First, the optimization problem which gives rise to the 
RA is formulated and then, its solution is obtained. Sec. IV discusses different methods (including stochastic) 
to estimate the multipliers required to implement the optimal RA. Sec. V describes different forms of CSI 
imperfections and analyzes how the optimal schemes have to be modified to account for imperfect CSI. Sec. 
VI presents different illustrative numerical examples that corroborate the theoretical claims. Conclusions in 
Sec. VII wrap up this paper.¹ 

II. Model description 

We consider a CR network with M secondary users (indexed by m) transmitting opportunistically and 
orthogonally over K different frequency bands (indexed by k). For simplicity, we assume that: i) each band 
has the same bandwidth and is occupied by a different primary user; and ii) the secondary network has an 
access point (AP) which is the destination of all secondary users. The AP acts as a central scheduler which 
collects the CSI and then makes the RA decisions. Extensions to scenarios where those assumptions do not 
hold true can be handled with a moderate increase in complexity. 

¹ Notation: $(\cdot)^T$ denotes vector transposition; $x^*$ the optimal value of variable $x$; $\wedge$ ($\vee$) the Boolean "and" ("or") operator; $\mathbb{E}[\cdot]$ 
expectation; $\mathbb{1}_{\{\cdot\}}$ the indicator function ($\mathbb{1}_{\{x\}} = 1$ if $x$ is true and zero otherwise); and $[x]_a^b$ the projection of the scalar $x$ onto the 
interval $[a,b]$, i.e., $[x]_a^b := \min\{\max\{x, a\}, b\}$. 

A. Channel state information 

Intuitively speaking, the CSI in wireless systems comprises the information of the channel links which: i) is 
known by the system and ii) is relevant from an RA perspective. A key feature of CR systems is that the CSI 
is heterogeneous, meaning that it is typically different for the primary and secondary network. The reason for 
that is twofold. First, the schemes used to acquire the CSI are different for the primary and secondary network 
[cf. i)]. Second, the impact of the CSI on the design of the RA is different [cf. ii)]. For ease of exposition, 
we first design the RA schemes assuming that the CSI is error-free. Accordingly, the model for the perfect 
CSI is presented here, while the model for imperfect CSI (and the corresponding modifications for the RA 
schemes) is presented in Sec. V. 

The CSI available at instant $n$ is formed by the variables $a_{k,1}[n]$, $h_{k,1}^m[n]$, and $h_{k,2}^m[n]$ for all $k$ and $m$. Before 
explaining the meaning of such variables, we clarify that subscript "1" will be used to emphasize that the 
channel involves primary transceivers, while subscript "2" is used to emphasize that only secondary transceivers 
are involved. Starting with the CSI of the PUs, $a_{k,1}[n]$ is a Boolean variable which is one if the PU that transmits 
on the $k$th channel is active at time $n$ and zero otherwise. Variable $h_{k,1}^m[n]$ represents the instantaneous 
noise-normalized power gain between the $m$th SU and the $k$th PU at instant $n$. Similarly, $h_{k,2}^m[n]$ represents the 
instantaneous noise-normalized power gain between the $m$th SU and the AP in the $k$th channel at instant $n$. 
All $a_{k,1}[n]$, $h_{k,1}^m[n]$, and $h_{k,2}^m[n]$ are stationary random processes. The assumption of perfect CSI implies that 
at instant $n$, the value of those variables is known deterministically. Finally, we will use $\gamma_{k,1}$ to denote the 
(interference-free) signal-to-noise ratio (SNR) between the PU transmitter and PU receiver. For simplicity, 
we will assume that $\gamma_{k,1}$ does not vary with time (either because the PU channels are fixed or because the 
PU transmitter implements a channel-inversion power loading [9]). Nonetheless, our schemes can be easily 
modified to account for $\gamma_{k,1}$ varying with time. 

To finish this section, let $\mathbf{h}$ denote the $K(2M + 1) \times 1$ vector of overall CSI containing: i) the power gains 
of the $MK$ CR-to-CR links; ii) the normalized power gains of the $MK$ CR-to-PU links; and iii) $K$ 
Boolean variables indicating whether the channels are occupied. Clearly, the value of h varies with time and, 
wherever convenient, we will write h[n] to stress this fact. 
B. Resources at the secondary network 

Now, we introduce the design variables, i.e., the variables that will be adapted as a function of the (primary 
and secondary) CSI $\mathbf{h}$. Let $w_{k,2}^m$ denote a Boolean variable taking the value one if the $m$th secondary user 
is scheduled to transmit into the $k$th band and zero otherwise. Provided that $w_{k,2}^m = 1$, let $p_{k,2}^m$ denote the 
instantaneous power transmitted over the $k$th band by the $m$th secondary user. We analyze the case where 
instantaneous rate and power variables are coupled through Shannon's capacity formula. Such a coupling will 
be written as $r_{k,2}^m(h_{k,2}^m p_{k,2}^m) := \log_2(1 + h_{k,2}^m p_{k,2}^m)$, which is an increasing and concave function. Nonetheless, 
the basic results in this paper hold for any $r_{k,2}^m(\cdot)$ increasing and concave. 

The CR operates in a time-block fashion, where the duration of each block corresponds to the coherence 
time of the fading channel. This way, at every time n the AP will use the current CSI vector h to find the 
(optimum) value of $w_{k,2}^m$ and $p_{k,2}^m$. Since $\mathbf{h}$ varies with $n$ and $\{w_{k,2}^m, p_{k,2}^m\}$ depend on $\mathbf{h}$, the value of the design 
variables $\{w_{k,2}^m, p_{k,2}^m\}$ will vary across time as well. Throughout the manuscript, we will write $\mathbf{h}$, $w_{k,2}^m(\mathbf{h})$ and 
$p_{k,2}^m(\mathbf{h})$, or $\mathbf{h}[n]$, $w_{k,2}^m[n]$ and $p_{k,2}^m[n]$, wherever it is convenient to emphasize the corresponding dependence. 

Having introduced the design variables, now we formulate constraints that these variables need to satisfy. 
To ensure that at most one user transmits into a given band $k$, we need $\sum_{m=1}^{M} w_{k,2}^m(\mathbf{h}) \le 1$. If the left-hand side 
of the constraint is equal to one, then one user is accessing the channel (orthogonal access). If it is equal to 
zero, then none is transmitting (either because all secondary channels are poor, or because transmitting would cause very high 
interference to the PUs). To simplify the notation, we consider an additional virtual SU $m = 0$, with zero 
transmit power and rate; i.e., $p_{k,2}^0 = r_{k,2}^0 = 0$. This user will be active (and thus $w_{k,2}^0 = 1$) if none of the 
actual SUs is transmitting. Then, we can write 

$$\sum_{m=0}^{M} w_{k,2}^m(\mathbf{h}) = 1, \quad \forall k. \qquad (1)$$

We also consider that the maximum average (long-term) power the $m$th SU can transmit is $\check{p}_2^m$; hence, 

$$\mathbb{E}_{\mathbf{h}}\left[\sum_{k} w_{k,2}^m(\mathbf{h})\, p_{k,2}^m(\mathbf{h})\right] \le \check{p}_2^m, \quad \forall m. \qquad (2)$$

Such a constraint is not only reasonable to effect QoS across CRs, but also to limit the power consumption 
of each of the CR transmitters. The expectation in (2) is taken over all possible values of $a_{k,1}[n]$, $h_{k,1}^m[n]$ and 
$h_{k,2}^m[n]$, i.e., considering all $m$, $k$, and $n$. While (1) needs to hold for each and every channel realization (hence, 
for each and every time instant), (2) only needs to hold in the long term. 
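
The roles of the per-instant scheduling constraint and the long-term power constraint can be illustrated with a minimal Python sketch. The helper names `schedule_vector` and `average_power`, as well as the toy allocation history, are hypothetical and only serve the illustration; index 0 plays the role of the virtual user, and the history is assumed to be non-empty.

```python
def schedule_vector(active_user, num_users):
    """One-hot scheduling vector (w^0, ..., w^M) for a single band.

    active_user is an index in 1..num_users, or None when no actual SU
    transmits, in which case the virtual user m = 0 is marked as active.
    """
    w = [0] * (num_users + 1)
    w[0 if active_user is None else active_user] = 1
    return w


def average_power(history, user):
    """Empirical long-term power of one SU.

    history: list over time instants of dicts band -> (scheduled_user, power).
    The result estimates the expectation of the power summed over bands.
    """
    total = 0.0
    for slot in history:
        for scheduled_user, power in slot.values():
            if scheduled_user == user:
                total += power
    return total / len(history)
```

Each vector returned by `schedule_vector` sums to one for every time instant, whereas the value returned by `average_power` only has to stay below the power budget of the user when averaged over many instants.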

C. Dynamic spectrum access constraints 

The next step is to identify the rules that dictate how SU transmissions affect the performance of the PUs. 
Such rules will be formulated as constraints that will be incorporated into the optimization problem that gives 
rise to the RA schemes. In other words, the DSA constraints will represent how SUs have to modify their 
behavior so that the damage caused to the PUs is kept under control. 

When the DSA constraints are formulated, several factors have a significant impact both in terms of the 
system operation and the mathematical formulation of the problem. Two important ones are discussed next. The 
first factor is whether the interference constraints are formulated as instantaneous (short-term) or as average 
(long-term) constraints. The former requires the constraint to hold for each and every time instant, while 
the latter requires the constraint to hold on average (taking into account all time instants jointly). Clearly, 
instantaneous constraints are more restrictive than their average counterparts (which can exploit the so-called 
"cognitive diversity" of the primary CSI [30], [29]), and therefore the performance of the secondary network 
will be higher in the latter case. Mathematically, long-term constraints are typically dualized, while short-term 
constraints are handled using alternative methods. The second factor is the metric used to measure the actual 
damage that the CRs inflict on the PUs. Among the metrics considered in the literature we find: interference 
power at the PUs, probability of interfering with the PUs, and rate loss inflicted on the PUs. Most works have 
focused on limiting the interference power. The reason is twofold: i) it is a simple (and intuitive) metric 
to measure the interference, and ii) it can be formulated as a convex constraint. Limiting the rate loss may 
be considered a better alternative because it focuses on the actual damage that the interference causes to 
the PUs (most communications systems are designed to either guarantee or maximize a certain transmission 
rate). From a mathematical perspective, constraints limiting the rate loss are typically non-convex. As a result, 
very few works have explored that alternative; see, e.g., [8], [19]. The problem of limiting the probability of 
interference for a system with operating conditions very similar to the ones considered in this paper was 
thoroughly investigated in [17]. 

As already mentioned, the main contribution of this work is to limit the long-term rate (capacity) loss on 
the PUs. However, we will also impose limits on the long-term interference power. The reason is twofold. 
First, such constraints were not considered for systems with the exact same operating conditions as those 
considered in this work; see [11] for a closely related one. More importantly, joint consideration of rate loss and 
interference power constraints will help us to compare these two alternatives. For similar reasons, the end of 
the section is devoted to discussing the modifications required to handle short-term interference power and rate 
loss constraints. 

We start with the formulation of the long-term interference power constraints. Let $\check{p}_{k,1}$ denote the maximum 
average interference power the $k$th primary receiver can tolerate (provided that the PU is active) and recall 
that the $m$th SU transmits in the $k$th channel only if the Boolean scheduling variable $w_{k,2}^m(\mathbf{h})$ is one. Then, 
the following $K$ constraints need to hold 

$$\mathbb{E}_{\mathbf{h}}\left[\sum_{m} w_{k,2}^m(\mathbf{h})\, h_{k,1}^m\, p_{k,2}^m(\mathbf{h}) \,\Big|\, a_{k,1} = 1\right] \le \check{p}_{k,1}, \quad \forall k. \qquad (3)$$

The fact that the expectation is taken across all $\mathbf{h}$ reflects that (3) is a long-term constraint. Clearly, for a 
given channel realization $\mathbf{h}$ just one of the $M + 1$ terms inside the expectation is active. This property will be 
exploited in upcoming sections. Finally, note that only CSI realizations for which $a_{k,1} = 1$ are considered in 
the expectation. In fact, (3) can be rewritten as $\mathbb{E}_{\mathbf{h}}[a_{k,1} \sum_{m} w_{k,2}^m(\mathbf{h})\, h_{k,1}^m\, p_{k,2}^m(\mathbf{h})] \le \mathbb{E}_{\mathbf{h}}[a_{k,1}\, \check{p}_{k,1}]$. If one does 
not want to bound the long-term interference power that the PU receives when it is active, but the long-term 
power at the PU receiver irrespective of whether the PU is active or not, then $a_{k,1}$ has to be removed from 
the previous expressions. 
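
To illustrate how the conditional form of the long-term interference constraint relates to its rewritten form, the following Python sketch estimates both averages from a list of samples. The function names and the sample triplets are hypothetical; each triplet holds the PU activity indicator, the SU-to-PU gain of the scheduled SU, and its transmit power, and the sample list is assumed to be non-empty.

```python
def interference_stats(samples):
    """Empirical averages for the long-term interference constraint.

    samples: iterable of (a, h1, p) with a in {0, 1} (PU activity),
    h1 the SU-to-PU gain of the scheduled SU, and p its transmit power.
    These names are illustrative and not taken from the paper.
    """
    n = n_active = 0
    acc = 0.0
    for a, h1, p in samples:
        n += 1
        n_active += a
        acc += a * h1 * p  # interference only counts when the PU is active
    cond_avg = acc / n_active if n_active else 0.0  # estimate of E[h1*p | a = 1]
    joint_avg = acc / n                             # estimate of E[a*h1*p]
    activity = n_active / n                         # estimate of E[a]
    return cond_avg, joint_avg, activity


def satisfies_both_forms(samples, p_limit):
    """Verdicts of the conditional constraint and of its rewritten form."""
    cond_avg, joint_avg, activity = interference_stats(samples)
    return cond_avg <= p_limit, joint_avg <= activity * p_limit
```

Up to floating-point rounding, both verdicts coincide because the two forms differ only by the positive factor that estimates the probability of the PU being active.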

Next, we formulate the long-term (ergodic) capacity constraints. For such a purpose we define the function 
$r_{k,1}(x) := \log_2\left(1 + \frac{\gamma_{k,1}}{1+x}\right)$, where $x$ stands for the interference power at the $k$th PU receiver. Our formulation 
guarantees a minimum long-term rate of $\check{r}_{k,1}$ for the $k$th PU. This minimum rate can either be a fixed value 
[19] or expressed as a fraction of the rate that the PU achieves when no CRs are present. Mathematically, 
the rate requirement in the latter case can be written as $\check{r}_{k,1} := (1 - \epsilon_k)\,\mathbb{E}_{\mathbf{h}}[a_{k,1}\, r_{k,1}(0)]$, where $\epsilon_k \in (0, 1)$ is 
the maximum (relative) capacity loss that the CRs can cause to the $k$th PU. With these issues in mind, the 
long-term capacity constraint is formulated as 

$$\mathbb{E}_{\mathbf{h}}\left[\sum_{m} w_{k,2}^m(\mathbf{h})\, r_{k,1}\big(h_{k,1}^m\, p_{k,2}^m(\mathbf{h})\big) \,\Big|\, a_{k,1} = 1\right] \ge \check{r}_{k,1}, \quad \forall k. \qquad (4)$$

Again, for a given channel realization $\mathbf{h}$ only one of the $M + 1$ terms inside the expectation is active. The 
expression in (4) confirms that if the constraint is written as $f(p_{k,2}^m(\mathbf{h})) \le 0$, then $f(\cdot)$ is a non-convex function 
[cf. the definition of $r_{k,1}(\cdot)$]. 
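
One way to see where the non-convexity comes from is to inspect the curvature of the PU rate as a function of the interference power. The short derivation below is a sketch that only assumes $r_{k,1}(x) = \log_2(1 + \gamma_{k,1}/(1+x))$ with $\gamma_{k,1} > 0$ and $x \ge 0$.

```latex
\begin{align*}
r_{k,1}(x) &= \log_2\!\left(1 + \frac{\gamma_{k,1}}{1+x}\right)
            = \log_2(1 + \gamma_{k,1} + x) - \log_2(1 + x), \\
\frac{d r_{k,1}}{dx} &= \frac{1}{\ln 2}\left[\frac{1}{1+\gamma_{k,1}+x} - \frac{1}{1+x}\right] < 0, \\
\frac{d^2 r_{k,1}}{dx^2} &= \frac{1}{\ln 2}\left[\frac{1}{(1+x)^2} - \frac{1}{(1+\gamma_{k,1}+x)^2}\right] > 0 .
\end{align*}
```

Hence the PU rate is strictly decreasing and strictly convex in the interference power. Since the interference is linear in the SU power, a lower bound on the average PU rate amounts to an upper bound on a strictly concave function of that power, which in general does not define a convex feasible set.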

We close this section by briefly discussing the formulation of the short-term DSA constraints. To write the 
short-term counterparts of (3) and (4) we do not need to take into account all h, but only the current one 
h[n]. Hence, the short-term constraints for the time instant n are 

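$$a_{k,1}[n] \sum_{m} w_{k,2}^m[n]\, h_{k,1}^m[n]\, p_{k,2}^m[n] \le a_{k,1}[n]\, \check{p}_{k,1}, \qquad (5)$$

$$a_{k,1}[n] \sum_{m} w_{k,2}^m[n]\, r_{k,1}\big(h_{k,1}^m[n]\, p_{k,2}^m[n]\big) \ge a_{k,1}[n]\, \check{r}_{k,1}, \qquad (6)$$

which need to hold for all $k$ and $n$. Capitalizing on the fact that at every time instant only one SU is active, 
the following alternative set of constraints can be considered 

$$a_{k,1}[n]\, h_{k,1}^m[n]\, p_{k,2}^m[n] \le a_{k,1}[n]\, \check{p}_{k,1}, \qquad (7)$$

$$a_{k,1}[n]\, r_{k,1}\big(h_{k,1}^m[n]\, p_{k,2}^m[n]\big) \ge a_{k,1}[n]\, \check{r}_{k,1}, \qquad (8)$$

which in this case need to hold for all $k$, $m$ and $n$. Clearly, if (7) and (8) are satisfied, then (5) and (6) are 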
satisfied too. It can also be rigorously shown that (7) and (8) do not imply a loss of optimality relative to (5) 
and (6). As already pointed out, key for showing this result is that at every time instant at most one SU is 
active, so that bounds on the non-active users are irrelevant. The main advantage of considering (7) and (8) is 
that the transmit powers of the different SUs are decoupled, so that each of the $MK$ expressions in (7) and 
(8) can be solved with respect to (w.r.t.) $p_{k,2}^m[n]$. This implies that the constraints can be rewritten as simple 
box constraints. To be specific, let $p_{k,\max}^m$ represent the maximum power the amplifier at the SU can transmit. 
Moreover, assume that $a_{k,1}[n] = 1$ and let $x_k^m[n]$ and $y_k^m[n]$ be, respectively, the values of $p_{k,2}^m[n]$ for which 
the constraints (7) and (8) are satisfied with equality. Based on these notational conventions, we define the 
maximum short-term power as $\bar{p}_{k,2}^m[n] := p_{k,\max}^m$ if $a_{k,1}[n] = 0$, and $\bar{p}_{k,2}^m[n] := \min\{x_k^m[n], y_k^m[n], p_{k,\max}^m\}$ 
if $a_{k,1}[n] = 1$. Then, the short-term DSA constraints can be replaced with $p_{k,2}^m[n] \le \bar{p}_{k,2}^m[n]$. In a nutshell, the 
orthogonal access among SUs allows us to rewrite the short-term DSA constraints as time-varying peak power 
constraints. The power bound enforced by each of such peak constraints will depend on the metrics used to 
measure the interference (rate loss and/or interference power), the limits set on the chosen metric ($\check{p}_{k,1}$ and 
$\check{r}_{k,1}$), and the CSI at instant $n$. 
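
The box-constraint reformulation can be sketched in a few lines of Python. The function below is only an illustration under stated assumptions: the PU rate is taken as log2(1 + gamma / (1 + interference)), the SU-to-PU gain `h1` is strictly positive, and the argument names (`p_limit`, `r_min`, `p_max`) are hypothetical labels for the interference limit, the rate limit, and the amplifier limit.

```python
import math


def pu_rate(gamma, x):
    """PU rate for interference power x, with interference-free SNR gamma."""
    return math.log2(1.0 + gamma / (1.0 + x))


def peak_power(pu_active, h1, gamma, p_limit, r_min, p_max):
    """Short-term power cap of one SU on one band at one time instant."""
    if not pu_active:
        return p_max  # no PU to protect: only the amplifier limits the power
    # Power at which the short-term interference-power constraint is tight.
    x = p_limit / h1
    # Power at which the short-term rate constraint is tight, obtained by
    # solving log2(1 + gamma / (1 + h1 * y)) = r_min for y.
    if r_min <= 0.0:
        y = math.inf  # the rate constraint is void
    else:
        y = max(0.0, (gamma / (2.0 ** r_min - 1.0) - 1.0) / h1)
    return min(x, y, p_max)
```

For instance, with `h1 = 2`, `gamma = 15`, `p_limit = 10`, `r_min = 2` and `p_max = 10`, the rate constraint is the binding one and the cap equals 2, since an interference of 4 leaves the PU with a rate of exactly log2(1 + 15/5) = 2. When the required rate exceeds what the PU achieves without interference, the cap is clipped to zero.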

III. Formulating and solving the RA problem 
To formulate the optimization problem that gives rise to the optimum RA algorithms, we need to identify: 
i) the variables to be optimized; ii) the constraints the variables need to satisfy; and iii) the metric to be 
optimized. The first step was accomplished in Sec. II-B. Regarding the second step, Boolean variables $w_{k,2}^m(\mathbf{h})$ 
are constrained to belong to the set $\{0, 1\}$ and variables $p_{k,2}^m(\mathbf{h})$ are constrained to belong to the set $[0, \bar{p}_{k,2}^m(\mathbf{h})]$, 
where $\bar{p}_{k,2}^m(\mathbf{h})$ stands for the instantaneous peak power constraint introduced at the end of Sec. II-C. Moreover, 
$w_{k,2}^m(\mathbf{h})$ and $p_{k,2}^m(\mathbf{h})$ need to satisfy (1) and (2), and the DSA constraints in (3) and (4). 

Regarding the third step (metric to be optimized), we are interested in maximizing the weighted ergodic sum-capacity 
given by $C_2 := \sum_{k,m} \mathbb{E}_{\mathbf{h}}\big[\beta^m\, w_{k,2}^m(\mathbf{h})\, r_{k,2}^m\big(h_{k,2}^m\, p_{k,2}^m(\mathbf{h})\big)\big]$, where $\beta^m > 0$ represents a user-dependent 
priority coefficient. Note that by varying $\beta^m$ the border of the capacity region can be found [28]. 

Recall that for a given channel realization $\mathbf{h}$ and channel $k$ only one of the $M + 1$ terms (SUs) is active. 
Other objective functions, such as ergodic sum-utility rate, could be used without changing the basic structure 
of the solution; see, e.g., [18] for further details on a related problem. 

Under all previous considerations, the optimal RA is obtained as the solution of the following problem: 
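$$\max_{\{w_{k,2}^m(\mathbf{h}),\; p_{k,2}^m(\mathbf{h})\}} \; C_2 \qquad (9a)$$

$$\text{s. to:} \quad w_{k,2}^m(\mathbf{h}) \in \{0, 1\}, \quad 0 \le p_{k,2}^m(\mathbf{h}) \le \bar{p}_{k,2}^m(\mathbf{h}), \quad (1); \qquad (9b)$$

$$\phantom{\text{s. to:}} \quad (2),\ (3),\ (4); \qquad (9c)$$

where the dependence of the optimization variables on the CSI $\mathbf{h}$ has been made explicit. Note that we 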
are interested in optimizing a long-term objective (9a), subject to both short-term (9b) and long-term (9c) 
constraints. As we will see in the next section, the approach to handle (9b) and (9c) will not be the same. 

A. Optimal RA 

The main challenge of finding the optimal RA is that (9) is not a convex problem. Basically, there are three 
sources of non-convexity in (9): i) scheduling coefficients $w_{k,2}^m$ constrained to belong to $\{0, 1\}$, which is 
a non-convex set; ii) the monomials $w_{k,2}^m p_{k,2}^m$, which are not jointly convex; and iii) the constraint (4), which is 
not convex w.r.t. $p_{k,2}^m$. The first two sources of non-convexity can be "easily" bypassed by transforming 
(relaxing) the problem in (9) into a convex one which yields the same optimality conditions; see App. 
A for technical details. However, the third source of non-convexity cannot be bypassed. Two undesirable 
consequences associated with lack of convexity are [3]: (c1) zero-duality gap is not guaranteed, and (c2) 
development of numerical algorithms that find the optimal solution in polynomial time is not guaranteed. 
Remarkably, it can be shown that the problem in (9) exhibits zero-duality gap (see the related discussion in 
App. A, and [24], [22]). This result implies that the constraints can be dualized without losing optimality. 
However, (c2) still holds, so that finding an efficient algorithm to optimize the (unconstrained) Lagrangian is 
still challenging. Interestingly, due to the structure of (9) we will show that the optimization can be separated 
(decomposed) across channels and users, decreasing dramatically the computational complexity to find the 
optimal solution. 

After the previous discussion, we are ready to present the solution of (9). Our approach to deal with the 
constraints in (9) is twofold. The long-term constraints in (9c) -namely, (2), (3) and (4)- will be dualized, while 
the constraints in (9b) (all short-term) will be handled using alternative methods such as scalar projections. 
Regarding the long-term constraints, let tt"*. Ok and pk denote the Lagrange multipliers associated with (2), 
(3) and (4), respectively. With this notational conventions, it can be shown (see App. A) that the optimal 
solution of (9) is 
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arg max (^"(jJ^^sN) 



2N 

(11) 





'"^fe^2 [w] : = 1 {m=arg max; (^^ (pj^* [n] )} 1 {p^5 [n] >0 V m=0}- ( 1 2) 

Key for understanding the solution of (9) is the definition of the functional $\varphi_k^m(\cdot)$ in (10). Mathematically, 
$\varphi_k^m(x)$ represents the contribution to the Lagrangian of (9) if the transmit power is $p_{k,2}^m[n] = x$ and $w_{k,2}^m[n] = 1$. 
Intuitively, (10) can be interpreted as a user-channel quality indicator (the higher the indicator, the better). 
Under this interpretation, the rates of SUs and PUs are rewards (first and fourth terms), and the transmit and 
interference powers are costs (second and third terms). The corresponding prices are the priority coefficient 
$\beta^m$ (SU rate), $\rho_k$ (PU rate), $\pi^m$ (transmit power) and $\theta_k$ (interference power). 
The indicator also manifests the existing trade-off between the SUs (first and second terms) and 
the PUs (third and fourth terms). Note that if the fourth term in (10) is replaced with 
$-\rho_k\, a_{k,1}[n]\big(r_{k,1}(0) - r_{k,1}(h_{k,1}^m[n]\, p_{k,2}^m[n])\big)$, the optimum values of $p_{k,2}^{m*}[n]$ and $w_{k,2}^{m*}[n]$ in (11) and (12) do not change. This implies 
that we can also interpret the quality indicator as a functional which penalizes the allocations that entail a 
high capacity loss for the PU. 
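
The invariance mentioned above can be checked numerically. The Python sketch below implements the quality indicator under the four-term interpretation just described (SU rate reward, transmit power cost, interference power cost, PU rate reward) together with its capacity-loss variant; the function names, argument names and the coarse grid search are assumptions made for illustration only.

```python
import math


def pu_rate(gamma, x):
    """PU rate for interference power x, with interference-free SNR gamma."""
    return math.log2(1.0 + gamma / (1.0 + x))


def quality(x, beta, pi, theta, rho, a, h1, h2, gamma):
    """Quality indicator: SU and PU rates as rewards, powers as costs."""
    return (beta * math.log2(1.0 + h2 * x) - pi * x
            - theta * a * h1 * x + rho * a * pu_rate(gamma, h1 * x))


def quality_loss_form(x, beta, pi, theta, rho, a, h1, h2, gamma):
    """Same indicator with the PU term written as a capacity-loss penalty."""
    loss = pu_rate(gamma, 0.0) - pu_rate(gamma, h1 * x)
    return (beta * math.log2(1.0 + h2 * x) - pi * x
            - theta * a * h1 * x - rho * a * loss)


def argmax_on_grid(func, grid, **params):
    """Grid point with the largest value of func (first one on ties)."""
    return max(grid, key=lambda x: func(x, **params))
```

Both versions differ by the constant `rho * a * pu_rate(gamma, 0.0)`, which does not depend on the transmit power, so they share the same maximizer and the same ranking of users within a band.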

Based on the definition of $\varphi_k^m(p_{k,2}^m[n])$, equation (11) reveals that $p_{k,2}^{m*}[n]$ can be found separately for each of the 
user-channel pairs. Similarly, (12) reveals that to find $\{w_{k,2}^{m*}[n]\}_{m=0}^{M}$, i.e., the optimal scheduling for channel 
$k$, no information from channels other than $k$ is required. These attractive features are present because the 
optimization problem in the dual domain is separable across users and channels (see [18], [17]). Keys for 
this property to hold are the consideration of orthogonal access in the secondary network and the definition 
of the objective in (9). Capitalizing on the favorable structure of the solution, we now analyze in further 
detail the optimal RA. Starting with the optimal scheduling in (12), we observe that $w_{k,2}^{m*}[n]$ is available in 
closed form, provided that the optimum power is known. Equation (12) reveals that the scheduling follows a 
winner-takes-all strategy, guaranteeing that the access is orthogonal (at most one user is active), opportunistic 
($\varphi_k^m$ is a continuous random variable), and greedy (only the user with highest quality in a given band must 
be scheduled). Note that the second condition in (12) dictates that if all users decide to transmit with zero 
power, the channel is assigned to the virtual user $m = 0$. 

The details of the optimum power allocation are 
a bit more intricate. To obtain $p_{k,2}^{m*}[n]$ we need first to maximize $\varphi_k^m(p_{k,2}^m[n])$ w.r.t. $p_{k,2}^m[n]$. Consider first 
a simplified case where the CR constraints (3) and (4) are not present. In such a case only the first two 
terms in (10) are present, so that $\varphi_k^m(\cdot)$ is strictly concave and differentiable. As a result, the optimization 
is convex and $p_{k,2}^{m*}[n]$ is easily found. Specifically, $p_{k,2}^{m*}[n]$ for this case is available in closed form as 
$p_{k,2}^{m*}[n] = \left[\frac{\beta^m \log_2(\exp(1))}{\pi^m} - \frac{1}{h_{k,2}^m[n]}\right]_0^{\bar{p}_{k,2}^m[n]}$. The previous expression is basically a waterfilling power loading [9] 
projected onto the feasible interval defined by the instantaneous constraints. When the CR constraint (3) is 
active, the third term in (10) needs to be considered. However, since that term is linear w.r.t. $p_{k,2}^m[n]$, the 
structure of $\varphi_k^m(\cdot)$ is basically the same and $p_{k,2}^{m*}[n]$ can be efficiently found. In fact, the solution follows 
again a (modified) waterfilling scheme $p_{k,2}^{m*}[n] = \left[\frac{\beta^m \log_2(\exp(1))}{\pi^m + \theta_k\, a_{k,1}[n]\, h_{k,1}^m[n]} - \frac{1}{h_{k,2}^m[n]}\right]_0^{\bar{p}_{k,2}^m[n]}$; see, e.g., [11]. Differently, 
when all four terms in (10) are considered, the optimization is challenging because $\varphi_k^m(\cdot)$ is not concave 
any more. The reason is that the last term is strictly convex, rendering the sum of the four terms in (10) 
non-concave and therefore, the optimization non-convex. 

However, the fact that the optimization is not convex does not necessarily imply that $p_{k,2}^{m*}[n]$
cannot be efficiently found. The first reason is that optimizing $\varphi_k^m(\cdot)$ involves a single
(scalar) variable. As a result, simple line search methods can be used. The second reason is that the
structure of $\varphi_k^m(\cdot)$ can be exploited to focus the search on a small region. For example, it can
be rigorously shown that the waterfilling solution is an upper bound for $p_{k,2}^{m*}[n]$. Moreover, if the
CSI is perfect, then $\varphi_k^m(\cdot)$ has at most three stationary points, so that $p_{k,2}^{m*}[n]$ is
either an endpoint of the feasible interval or one of those three points. Once
$\{p_{k,2}^{m*}[n]\}_{m=1}^{M}$ are found, finding $\{w_{k,2}^{m*}[n]\}_{m=0}^{M}$ just requires the
evaluation of closed-form expressions [cf. (12)]. In other words, because in the dual domain the problem can
be separated across users and channels, optimizing the Lagrangian does not require solving one non-convex
problem in a $(2M + 1)K$-dimensional space. Rather, $(M + 1)K$ closed forms need to be evaluated (for the
scheduling coefficients), and $MK$ non-convex problems in a one-dimensional space need to be solved (for the
power loadings).
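
The per-channel procedure described above (a scalar search for each user's power, bounded by the waterfilling
solution, followed by winner-takes-all scheduling) can be illustrated with a minimal Python sketch. The
function names, the logarithmic rate models and the specific four-term form of `quality` are assumptions made
for illustration only; they are not the exact expression in (10).

```python
import math

def waterfilling(beta, pi, h2, p_max):
    """Closed-form maximizer of beta*log2(1 + h2*p) - pi*p over [0, p_max]."""
    p = beta * math.log2(math.e) / pi - 1.0 / h2
    return min(max(p, 0.0), p_max)

def quality(p, beta, pi, theta, rho, a1, h1, h2, snr_pu):
    """Assumed four-term quality indicator (illustrative, not the paper's (10)).

    Terms: SU rate reward, SU power price, PU interference price, and a
    price on the PU capacity loss (the term that breaks concavity)."""
    su_rate = beta * math.log2(1.0 + h2 * p)
    pu_loss = math.log2(1.0 + snr_pu) - math.log2(1.0 + snr_pu / (1.0 + h1 * p))
    return su_rate - pi * p - theta * a1 * h1 * p - rho * a1 * pu_loss

def line_search(phi, p_hi, num=2001):
    """Exhaustive scalar search of phi on a uniform grid over [0, p_hi]."""
    best_p, best_val = 0.0, phi(0.0)
    for i in range(1, num):
        p = p_hi * i / (num - 1)
        val = phi(p)
        if val > best_val:
            best_p, best_val = p, val
    return best_p, best_val

def schedule(powers, values):
    """Winner-takes-all: return the user (1..M) with the highest quality among
    those with positive power, or the virtual user 0 if all powers are zero."""
    active = [m for m, p in enumerate(powers) if p > 0.0]
    if not active:
        return 0
    return 1 + max(active, key=lambda m: values[m])
```

Restricting the grid to the interval between zero and the waterfilling power reflects the upper bound
mentioned in the text; a finer grid or a local refinement around the best grid point could be used if more
accuracy is needed.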

The expressions obtained in this section revealed how the optimal RA depends on the (perfect) CSI and the 
Lagrange multipliers. Schemes to compute the multipliers in our CR setup are discussed in the next section, 
while the alternatives to account for CSI imperfections are analyzed in Sec. V. 

IV. Stochastic estimation of the multipliers 

Different methods can be used to obtain the value of $\pi^m$, $\theta_k$ and $\rho_k$. Based on Lagrangian
Duality Theory, $\{\pi^m, \theta_k, \rho_k\}$ are set to a constant value $\{\pi^{m*}, \theta_k^*, \rho_k^*\}$
corresponding to the value that maximizes the dual function associated with (9). Since our problem has zero
duality gap, when $\pi^m = \pi^{m*}$, $\theta_k = \theta_k^*$ and $\rho_k = \rho_k^*$ are substituted into
(10)-(12), the resulting RA is the optimal solution of (9) [3]. To find such values, one has to resort to
iterative search algorithms such as dual subgradient methods [3], which at each iteration update the value of
the multiplier according to the long-term violation of the corresponding constraint (let us recall that
regardless of the convexity of the primal problem, the dual problem is always convex). Dual subgradient
methods (either with constant or diminishing stepsize) and dual descent methods are reasonable alternatives
for the problem at hand. Methods exploiting the separability in the dual domain can be used too. The main
drawback associated with all previous methods is that at every iteration, the expectations in the long-term
constraints (which require averaging over all possible states of $\mathbf{h}$) need to be computed. Moreover,
the multipliers have to be recomputed if either the long-term distribution of the channels or the number of
users changes.

Recently, alternative approaches that rely on stochastic approximation tools have been proposed to find the
value of the multipliers [23], [19], [27]. These approaches do not try to find the optimal value of
$\{\pi^{m*}, \theta_k^*, \rho_k^*\}$, but time-varying estimates of them
$\{\pi^m[n], \theta_k[n], \rho_k[n]\}$ which are updated at every instant $n$ and remain sufficiently close to
$\{\pi^{m*}, \theta_k^*, \rho_k^*\}$. An important advantage of these approaches is that their computational
complexity is very low. Moreover, they exhibit additional advantages that are especially attractive in CR
setups. Namely: i) they are robust to channel non-stationarities (which may arise in environments with
interference); ii) they do not need to have statistical knowledge of the channels; and iii) they can cope
with changes in either the secondary network (number of users, or QoS levels) or the primary network (limits
on the interference power, rate loss, or capacity function of the PUs). In other words, stochastic schemes
offer a way to learn the environment online and keep track of its time variation. As we will see, the only
price to pay is that the resulting schemes are slightly suboptimal.

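To be specific and rigorous, with $\eta_\pi$, $\eta_\theta$ and $\eta_\rho$ being small and constant
stepsizes, the following iterations are proposed

$\pi^m[n+1] = \big[\pi^m[n] - \eta_\pi\big(\check{p}^m - \sum_k w_{k,2}^m[n]\,p_{k,2}^m[n]\big)\big]_0^\infty$ (13)

$\theta_k[n+1] = \big[\theta_k[n] - \eta_\theta\,a_{k,1}[n]\big(\check{p}_{k,1} - \sum_m w_{k,2}^m[n]\,h_{k,1}^m[n]\,p_{k,2}^m[n]\big)\big]_0^\infty$ (14)

$\rho_k[n+1] = \big[\rho_k[n] + \eta_\rho\,a_{k,1}[n]\big(\check{r}_{k,1} - \sum_m w_{k,2}^m[n]\,r_{k,1}(h_{k,1}^m[n]\,p_{k,2}^m[n])\big)\big]_0^\infty$. (15)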

From an optimization point of view, the updates in (13)-(15) form an unbiased stochastic subgradient of the dual 
function of (9); see [3]. Assuming that the updates in (13)-(15) are bounded, the following optimality/feasibility 
result can be shown.¹
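
As an illustration of how the projected stochastic updates of the three multiplier families could be coded,
the following Python sketch performs one iteration for all users and channels. It is a sketch under stated
assumptions: the argument names, the list-based data layout, the single common stepsize `eta`, and the
logarithmic `pu_rate` model are hypothetical choices, and the PU rate is evaluated at the aggregate
interference, which coincides with the per-user sum when access is orthogonal and an idle channel is
represented by zero transmit power.

```python
import math

def pu_rate(k, interference, snr=10.0):
    """Hypothetical PU rate on channel k as a function of received interference."""
    return math.log2(1.0 + snr / (1.0 + interference))

def dual_step(pi, theta, rho, w, p, a1, h1, p_su, p_pu, r_pu, eta):
    """One projected stochastic update of the multipliers.

    pi[m]: price of SU power; theta[k]: price of interference at PU k;
    rho[k]: price of PU rate loss. w[k][m], p[k][m], h1[k][m] hold the current
    scheduling, power and SU-to-PU gain; a1[k] is the PU activity (0 or 1).
    p_su[m], p_pu[k], r_pu[k] are the long-term limits."""
    K, M = len(w), len(pi)
    new_pi = [
        max(pi[m] - eta * (p_su[m] - sum(w[k][m] * p[k][m] for k in range(K))), 0.0)
        for m in range(M)
    ]
    new_theta, new_rho = [], []
    for k in range(K):
        interference = sum(w[k][m] * h1[k][m] * p[k][m] for m in range(M))
        new_theta.append(max(theta[k] - eta * a1[k] * (p_pu[k] - interference), 0.0))
        new_rho.append(max(rho[k] + eta * a1[k] * (r_pu[k] - pu_rate(k, interference)), 0.0))
    return new_pi, new_theta, new_rho
```

A multiplier grows when the instantaneous allocation overshoots its constraint and shrinks (down to zero)
otherwise, so that over time each price tracks the long-term violation of its constraint.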

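Proposition 1: The sample average of the stochastic RA: i) is feasible and ii) entails a small loss of
performance relative to the optimal solution of (9). Specifically, define
$\eta := \max\{\eta_\pi, \eta_\theta, \eta_\rho\}$;
$\bar{p}_2^m[n] := \frac{1}{n}\sum_{l=1}^{n}\sum_k w_{k,2}^m[l]\,p_{k,2}^m[l]$;
$\bar{c}_2[n] := \frac{1}{n}\sum_{l=1}^{n}\sum_{k,m}\beta^m w_{k,2}^m[l]\,r_{k,2}^m(h_{k,2}^m[l]\,p_{k,2}^m[l])$;
$\bar{p}_{k,1}[n] := \frac{1}{n}\sum_{l=1}^{n} a_{k,1}[l]\sum_m w_{k,2}^m[l]\,h_{k,1}^m[l]\,p_{k,2}^m[l]$; and
$\bar{r}_{k,1}[n] := \frac{1}{n}\sum_{l=1}^{n} a_{k,1}[l]\sum_m w_{k,2}^m[l]\,r_{k,1}(h_{k,1}^m[l]\,p_{k,2}^m[l])$.
Then, it holds with probability one that as $n \to \infty$:

i) $\bar{p}_2^m[n] \le \check{p}^m$, $\bar{p}_{k,1}[n] \le \check{p}_{k,1}$, $\bar{r}_{k,1}[n] \ge \check{r}_{k,1}$, and

ii) $\bar{c}_2[n] \ge c_2^* - \Delta(\eta)$, where $\Delta(\eta) \to 0$ as $\eta \to 0$.

¹ The proof of this result is not presented here due to space limitations, but it can be derived following
the lines of [23], [18].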

In words, the proposition guarantees asymptotic optimality of the stochastic iterates because they give rise
to an RA which is feasible and achieves a value (performance) arbitrarily close to $c_2^*$, which is the
optimal objective that the original (non-stochastic) solution of (9) achieves [cf. (9a)]. Note also that
$\eta$ can be used as a parameter to set the tradeoff between optimality and tracking capabilities. If
optimality is the only concern, the stochastic iterations in (13)-(15) could be run using a time-varying
stepsize $\eta[n]$ which diminishes with time. Under mild conditions, it can be shown that such iterations
converge to the optimal solution; see, e.g., [19] for details. Clearly, the price to pay in that case is that
the algorithms would lose their tracking capabilities.

Remark 1: In this work, we have assumed that there is a central scheduler (AP) that gathers the CSI, finds
the optimum RA, and runs the stochastic iterates. Moreover, we have also assumed that the signalling channels 
which convey the control information are error free. Nonetheless, it is worth remarking that the stochastic 
estimates are robust to errors. In fact, if the errors in the updates are bounded and have zero mean, then the 
results in Prop. 1 still hold. See [7] for a related result. In addition, the next section will show that our schemes 
are also robust to errors/imperfections in the CSI. 

V. Imperfect channel state information 

The optimal RA schemes were designed assuming that the CSI was perfect. Here, we relax that assumption
and account for CSI imperfections. Although the assumption of perfect CSI may be reasonable for some
wireless systems, it is unlikely to hold in CR scenarios (see related discussion in Sec. I). This is especially
true for the CSI of the primary network, which is typically more difficult to obtain and entails a higher cost
than that of secondary links. We first present different alternatives to model the CSI imperfections and then
describe how the RA schemes have to be modified to account for them.

The main change in the formulation when the CSI is not perfect is that the values of $a_{k,1}[n]$,
$h_{k,1}^m[n]$ and $h_{k,2}^m[n]$ (instantaneous CSI) are no longer deterministically known at instant $n$.
Rather, the knowledge of $a_{k,1}[n]$, $h_{k,1}^m[n]$ and $h_{k,2}^m[n]$ will be probabilistic and time
varying. As a result, the CSI now will correspond to the probability density function (pdf) of $a_{k,1}[n]$,
$h_{k,1}^m[n]$, $h_{k,2}^m[n]$ available at time $n$. Such a pdf will be referred to as instantaneous belief
and denoted as $b_{k,1}(x|n)$, $b_{k,1}^m(x|n)$, $b_{k,2}^m(x|n)$, respectively. The specific expression for
the instantaneous belief will depend on the operating conditions of the system. Focusing on $h_{k,1}^m[n]$
for illustrative purposes, two extreme examples are analyzed next. First, consider the case when the CSI is
perfect. For this case, the value of $h_{k,1}^m[n]$ at instant $n$ is perfectly known, so that the belief at
instant $n$ (instantaneous pdf) would be $b_{k,1}^m(x|n) = \delta(x - h_{k,1}^m[n])$, where $\delta(\cdot)$
is a Dirac delta function. Consider now that no instantaneous measurements are available, so that only
(long-term) statistical CSI is available. For the case of Rayleigh channels, the belief would be
$b_{k,1}^m(x|n) = \exp(-x/\bar{h}_{k,1}^m)/\bar{h}_{k,1}^m$, where $\bar{h}_{k,1}^m$ represents the average
gain of the SU-to-PU channel. Clearly, in this case the belief would not vary with time.

Three different sources of imperfections are considered here: quantized CSI, noisy CSI, and outdated CSI.
For each of them, we first give a high-level description of how to model the imperfections and the
corresponding belief. Then, we provide several examples that will allow us to gain insights and be more
specific. Regarding the first source of imperfections, research has consistently shown that feeding back a
small number of information bits about the instantaneous channel conditions to the transmitter (or
schedulers) can allow near-optimal channel adaptation [15]. To implement such schemes, the channel domain has
to be quantized into non-overlapping quantization regions. Such quantization can be carried out jointly for
different channels (vector quantization) or separately for each of them. Once the quantizer is known, at each
instant the transmitter is notified of the region the instantaneous channel falls into. The instantaneous
belief will be given by the pdf of the channel gain within the active region. A different source of
imperfections is the presence of noise in the channel measurements. A zero-mean additive white noise is
typically assumed, so that the belief will be given by the instantaneous channel measurement and the noise
pdf. Many systems do not estimate the power gain of the channel, but its complex low-pass equivalent. In such
a case, the (complex) noise would affect the low-pass equivalent. The belief in this case can be obtained
from the actual measurement and the noise distribution, taking into account that the power gain is the
squared modulus of the complex low-pass equivalent. Finally, we also consider that the CSI may be outdated.
This model is well motivated in CRs where sensing the (PU) channels entails a high cost so that they cannot
be sensed at every time instant. To update the belief in this case we need to assume a specific
time-correlation model for the CSI. Based on that model and on the available measurements up to instant $n$,
the belief is estimated using stochastic prediction/correction schemes.
Example 1: A simple but very effective alternative to define the quantized CSI is to use a scalar quantizer
for each of the channel gains. For example, focusing on the SU-to-SU channels, the domain of $h_{k,2}^m[n]$
can be divided into $L$ non-overlapping intervals $[\tau_{k,2}^{m,l-1}, \tau_{k,2}^{m,l})$, where
$\tau_{k,2}^{m,l}$, $l = 0, \ldots, L$, stands for the $l$th quantization threshold, with
$\tau_{k,2}^{m,0} = 0$ and $\tau_{k,2}^{m,L} = \infty$. Clearly, in this case $\log_2(L)$ bits suffice to
identify the region (interval) the channel $h_{k,2}^m[n]$ falls into. Most quantized CSI designs ignore the
time-correlation of the channel and assume that the CSI is available instantaneously and free of errors [15].
In such a scenario, let $l_{k,2}^m[n]$ be the index which identifies the region the channel $h_{k,2}^m[n]$
falls into. If the channel $h_{k,2}^m[n]$ follows an exponential distribution (Rayleigh model) and its
average gain is $\bar{h}_{k,2}^m$, the belief of $h_{k,2}^m[n]$ at instant $n$ is
$b_{k,2}^m(x|n) = [\exp(-x/\bar{h}_{k,2}^m)/\bar{h}_{k,2}^m] / \Pr\{h_{k,2}^m \in [\tau_{k,2}^{m,l-1}, \tau_{k,2}^{m,l})\}$
for $x \in [\tau_{k,2}^{m,l-1}, \tau_{k,2}^{m,l})$ with $l = l_{k,2}^m[n]$, and zero otherwise.
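
To make the quantized belief of Example 1 concrete, the following Python sketch builds a scalar quantizer for
an exponentially distributed gain and evaluates the truncated-exponential belief within a region. The choice
of equally probable regions and all function names are assumptions made for illustration; any other set of
thresholds could be plugged in.

```python
import math

def equiprobable_thresholds(mean_gain, num_regions):
    """Thresholds tau_0 = 0 < ... < tau_L = inf giving regions of probability 1/L
    for an exponentially distributed gain with the given mean."""
    taus = [0.0]
    taus += [-mean_gain * math.log(1.0 - l / num_regions) for l in range(1, num_regions)]
    taus.append(math.inf)
    return taus

def region_index(gain, taus):
    """Return l in 1..L such that taus[l-1] <= gain < taus[l]."""
    for l in range(1, len(taus)):
        if taus[l - 1] <= gain < taus[l]:
            return l
    raise ValueError("gain must be non-negative")

def region_probability(mean_gain, lo, hi):
    """Probability that the exponential gain falls in [lo, hi)."""
    return math.exp(-lo / mean_gain) - math.exp(-hi / mean_gain)

def belief_pdf(x, mean_gain, lo, hi):
    """Belief of the gain given that it lies in [lo, hi): truncated exponential."""
    if not (lo <= x < hi):
        return 0.0
    return math.exp(-x / mean_gain) / mean_gain / region_probability(mean_gain, lo, hi)

def belief_mean(mean_gain, lo, hi):
    """Mean of the truncated-exponential belief on [lo, hi)."""
    upper = 0.0 if math.isinf(hi) else (hi + mean_gain) * math.exp(-hi / mean_gain)
    lower = (lo + mean_gain) * math.exp(-lo / mean_gain)
    return (lower - upper) / region_probability(mean_gain, lo, hi)
```

For instance, with four regions the transmitter only needs the two-bit output of `region_index` to rebuild
the belief through `belief_pdf`.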

Example 2: The task of acquiring the Boolean variable $a_{k,1}[n]$ is basically a detection problem. Consider
that the output of the detection process is binary and denoted by $\hat{a}_{k,1}[n]$. In order to incorporate
the sensing errors into our model, we denote the probabilities of miss detection and false alarm as
$P_{MD} := \Pr\{\hat{a}_{k,1}[n] = 0 \,|\, a_{k,1}[n] = 1\}$ and
$P_{FA} := \Pr\{\hat{a}_{k,1}[n] = 1 \,|\, a_{k,1}[n] = 0\}$, respectively. Based on those, we define
$P_{0|0} := [(1 - P_{FA})P_0]/[(1 - P_{FA})P_0 + P_{MD}P_1]$ and
$P_{1|1} := [(1 - P_{MD})P_1]/[P_{FA}P_0 + (1 - P_{MD})P_1]$, where $P_0$ and $P_1$ stand for the long-term
probabilities $\Pr\{a_{k,1} = 0\}$ and $\Pr\{a_{k,1} = 1\}$, respectively. If the time-correlation of
$a_{k,1}[n]$ is ignored, then the belief of $a_{k,1}[n]$ at time $n$ is simply:
$b_{k,1}(x|n) = P_{0|0}\,\delta(x) + (1 - P_{0|0})\,\delta(x - 1)$ if $\hat{a}_{k,1}[n] = 0$; and
$b_{k,1}(x|n) = (1 - P_{1|1})\,\delta(x) + P_{1|1}\,\delta(x - 1)$ if $\hat{a}_{k,1}[n] = 1$. Schemes to
update the belief for more general sensing models and that leverage the time-correlation of the PUs activity
can be found in, e.g., [17].
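
The belief of Example 2 is a two-point distribution, so it is fully described by the posterior probability
that the PU is active given the detector output. A minimal Python sketch of that Bayes computation follows;
the function and argument names are hypothetical.

```python
def detection_belief(detected_active, p_fa, p_md, p_active):
    """Posterior probability that the PU is active given the binary detector output.

    p_fa, p_md: false-alarm and miss-detection probabilities of the detector.
    p_active: long-term probability of the PU being active (time-correlation ignored)."""
    p_idle = 1.0 - p_active
    if detected_active:
        # P_{1|1}: the PU is active and the detector says so
        return (1.0 - p_md) * p_active / (p_fa * p_idle + (1.0 - p_md) * p_active)
    # 1 - P_{0|0}: the PU is active although the detector reported idle
    p_idle_given_idle = (1.0 - p_fa) * p_idle / ((1.0 - p_fa) * p_idle + p_md * p_active)
    return 1.0 - p_idle_given_idle
```

With a perfect detector (both error probabilities equal to zero) the function returns 1 or 0, which
corresponds to the Dirac-delta belief of the perfect-CSI case.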

Example 3: In this example, we design prediction/correction schemes for a practical channel/measurement
model for the SU-to-PU channels. Let $g_{k,1}^m[n]$ be the low-pass equivalent of the SU-to-PU channel, so
that $h_{k,1}^m[n] = \|g_{k,1}^m[n]\|^2$. We assume that $g_{k,1}^m[n]$ is a complex Gaussian process with
independent real and imaginary parts (Rayleigh model). For notational convenience we will deal with
$g_{k,1}^m[n]$ as a $2 \times 1$ vector whose first and second entries correspond to the real and imaginary
parts, respectively. The time dynamics of $g_{k,1}^m[n]$ are assumed to follow a first-order Markovian model
with $g_{k,1}^m[n] = (\varrho_k^m)^{1/2} g_{k,1}^m[n-1] + (1 - \varrho_k^m)^{1/2} d_k^m[n]$, where
$\varrho_k^m$ represents the autocorrelation coefficient and $d_k^m[n]$ an innovation process independent of
$g_{k,1}^m[n-1]$. The process $d_k^m[n]$ is assumed to be white and complex Gaussian distributed with zero
mean and diagonal covariance matrix $(\bar{h}_{k,1}^m/2) I_2$, where $I_2$ is the $2 \times 2$ identity
matrix [9]. Once the model of the ground-truth channel has been described, we introduce the model for the
measurements and errors. For such a purpose, let $s_k^m[n]$ denote a Boolean variable which is one if the
channel is sensed at instant $n$ and zero otherwise. Moreover, let $\tilde{g}_{k,1}^m[n]$ denote the noisy
measurement of $g_{k,1}^m[n]$ obtained if $s_k^m[n] = 1$. The measurement is modeled as
$\tilde{g}_{k,1}^m[n] = g_{k,1}^m[n] + v_k^m[n]$, where $v_k^m[n]$ is a white noise independent of
$g_{k,1}^m[n]$ which follows a complex Gaussian distribution with zero mean and diagonal covariance matrix
$\sigma_k^m I_2$. Let $f_{g_{k,1}^m[n]}(x)$ denote the pdf of $g_{k,1}^m[n]$ at instant $n$, conditioned on
all measurements up to instant $n$. Under the previous model, it readily follows that $f_{g_{k,1}^m[n]}(x)$
is a Gaussian pdf and its mean and covariance (denoted, respectively, as $\mu_k^m[n]$ and
$\nu_k^m[n] I_2$) suffice to describe the full distribution. The stochastic iterations to update
$\mu_k^m[n]$ and $\nu_k^m[n]$ are described next.

If $s_k^m[n] = 0$, then it holds that $\mu_k^m[n] = (\varrho_k^m)^{1/2}\mu_k^m[n-1]$ and
$\nu_k^m[n] = \varrho_k^m \nu_k^m[n-1] + (1 - \varrho_k^m)\bar{h}_{k,1}^m/2$. If $s_k^m[n] = 1$, we first
update the belief of the previous instant to get the predictions
$\hat{\mu}_k^m[n] = (\varrho_k^m)^{1/2}\mu_k^m[n-1]$ and
$\hat{\nu}_k^m[n] = \varrho_k^m \nu_k^m[n-1] + (1 - \varrho_k^m)\bar{h}_{k,1}^m/2$. Then, we use the
measurement $\tilde{g}_{k,1}^m[n]$ to correct the predictions as follows:

$\mu_k^m[n] = (\hat{\nu}_k^m[n] + \sigma_k^m)^{-1}(\sigma_k^m\,\hat{\mu}_k^m[n] + \hat{\nu}_k^m[n]\,\tilde{g}_{k,1}^m[n])$ (16)

$\nu_k^m[n] = (\hat{\nu}_k^m[n] + \sigma_k^m)^{-1}\,\hat{\nu}_k^m[n]\,\sigma_k^m$. (17)

Clearly, when $s_k^m[n] = 1$ the updates correspond to those of a classical Kalman filter. Different
prediction/correction steps will be required if either the time dynamics or the sensing errors are modeled
differently. See, e.g., [17] for alternative models. As mentioned before, based on $f_{g_{k,1}^m[n]}(x)$
(instantaneous pdf of $g_{k,1}^m[n]$), the belief $b_{k,1}^m(x|n)$ (instantaneous pdf of $h_{k,1}^m[n]$) can
be obtained by using the transformation $h_{k,1}^m[n] = \|g_{k,1}^m[n]\|^2$.
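
The prediction/correction recursion of Example 3 can be sketched in a few lines of Python. This is a minimal
sketch under stated assumptions: the low-pass equivalent is stored as a two-entry list (real and imaginary
parts), both components share a scalar variance, and the names `var_d` (per-component innovation variance)
and `var_noise` (per-component measurement-noise variance) are hypothetical.

```python
import math

def predict(mu, var, corr, var_d):
    """Prediction step of the first-order Markov model (no measurement used)."""
    mu_pred = [math.sqrt(corr) * x for x in mu]
    var_pred = corr * var + (1.0 - corr) * var_d
    return mu_pred, var_pred

def correct(mu_pred, var_pred, g_meas, var_noise):
    """Correction step: scalar-gain Kalman update applied to both components."""
    gain = var_pred / (var_pred + var_noise)
    mu = [m + gain * (g - m) for m, g in zip(mu_pred, g_meas)]
    var = var_pred * var_noise / (var_pred + var_noise)
    return mu, var

def update_belief(mu, var, corr, var_d, sensed, g_meas=None, var_noise=None):
    """Belief update at one instant: predict always, correct only if sensed."""
    mu_pred, var_pred = predict(mu, var, corr, var_d)
    if not sensed:
        return mu_pred, var_pred
    return correct(mu_pred, var_pred, g_meas, var_noise)

def expected_power_gain(mu, var):
    """Mean of the squared modulus of a Gaussian 2-vector with covariance var*I."""
    return mu[0] ** 2 + mu[1] ** 2 + 2.0 * var
```

When the channel is not sensed for a long time, repeated prediction drives the mean to zero and the variance
to `var_d`, i.e., the belief falls back to the statistical (long-term) CSI.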

To finish this section, we introduce notation $\tilde{\mathbf{h}}[n]$ to denote the overall imperfect CSI
available at time $n$. For example, suppose that: a) the CSI of the SU-to-SU gains is quantized as described
in Example 1; b) the errors on the activity of the PUs follow the model described in Example 2; and c) the
CSI of the SU-to-PU channels is outdated and noisy as described in Example 3. With these operating
conditions, $\tilde{\mathbf{h}}[n]$ is a vector of length $(3M + 1)K$ containing: i) the region index of each
of the gains of the $MK$ SU-to-SU links; ii) the probability of each of the $K$ PUs being active; and iii)
the means and variances of the $MK$ SU-to-PU links. Clearly, based on the information gathered in
$\tilde{\mathbf{h}}[n]$, the instantaneous beliefs $b_{k,1}(x|n)$, $b_{k,1}^m(x|n)$, $b_{k,2}^m(x|n)$ can be
trivially obtained. For notational convenience, we will use $\mathbf{b}(\mathbf{x}|n)$ to denote the belief
of the CSI of the overall system. Moreover, $\mathbf{b}(\mathbf{x}|n)$ will be written as
$\mathbf{b}(\mathbf{x}|\tilde{\mathbf{h}}[n])$ whenever it is convenient to stress the dependence on
$\tilde{\mathbf{h}}[n]$.

A. Modifying the RA schemes 

The first step to design RA schemes capable of accounting for CSI imperfections is to modify the formulation
of the constraints which depend explicitly on the instantaneous CSI. Strictly speaking, the formulation of
the long-term constraints in (2), (3) and (4) (and the objective $\bar{c}_2$) does not have to be modified.
One just has to take into account that the total expectation $\mathbb{E}_{\mathbf{h}}[\cdot]$ in those
constraints can be rewritten as
$\mathbb{E}_{\tilde{\mathbf{h}}}[\mathbb{E}_{\mathbf{b}(\mathbf{x}|\tilde{\mathbf{h}})}[\cdot]]$. The notation
emphasizes that the inner expectation is taken over $a_{k,1}[n]$, $h_{k,1}^m[n]$ and $h_{k,2}^m[n]$ according
to the pdfs in $\mathbf{b}(\mathbf{x}|\tilde{\mathbf{h}})$. Differently, the short-term constraints in (7)
and (8) need to be modified. When the CSI is imperfect, those constraints involve random variables, so that
strict satisfaction of the constraints may be impossible (e.g., if the instantaneous belief has infinite
support). As a result, the constraints have to be reformulated. A reasonable reformulation is to take
expectations across the instantaneous belief at both sides of the constraints and consider

$\mathbb{E}_{b_{k,1}(x|n)}[a_{k,1}[n]]\;\mathbb{E}_{b_{k,1}^m(x|n)}[h_{k,1}^m[n]]\,p_{k,2}^m[n] \le \mathbb{E}_{b_{k,1}(x|n)}[a_{k,1}[n]]\,\check{p}_{k,1}$ (18)

$\mathbb{E}_{b_{k,1}(x|n)}[a_{k,1}[n]]\;\mathbb{E}_{b_{k,1}^m(x|n)}[r_{k,1}(h_{k,1}^m[n]\,p_{k,2}^m[n])] \ge \mathbb{E}_{b_{k,1}(x|n)}[a_{k,1}[n]]\,\check{r}_{k,1}$. (19)

Note that to gain intuition in (18) and (19) we have implicitly assumed that $a_{k,1}[n]$ and
$h_{k,1}^m[n]$ are independent, so that the expectations were obtained separately. The long-term expectations
in (3) and (4) are different from those in (18) and (19). In the former, the expectations were taken
considering all time instants. In the latter, the expectations are taken at instant $n$ and only over the CSI
uncertainties. Clearly, as the knowledge of the CSI improves, the beliefs approach a Dirac delta centered at
the actual value of the channel and hence, the constraints in (18) and (19) approach those in (7) and (8). As
we did in Sec. II-C, to handle the short-term DSA constraints we solve (18) and (19) w.r.t. $p_{k,2}^m[n]$
and redefine the maximum instantaneous peak power constraint as
$p_{k,\max}^m[n] := \min\{x_k^m[n], y_k^m[n], p_{k,\max}^m\}$, where $x_k^m[n]$ and $y_k^m[n]$ are the roots
of (18) and (19), respectively. Another reasonable reformulation to handle the CSI imperfections is to
consider that (7) and (8) need to hold with a certain short-term probability (e.g., the probability of the
interference power at time $n$ exceeding $\check{p}_{k,1}$ has to be less than a certain value). The
procedure to deal with the constraints would be similar. The instantaneous belief would be used to solve the
constraints w.r.t. $p_{k,2}^m[n]$, the corresponding values of $x_k^m[n]$ and $y_k^m[n]$ would be found, and
such values would be used to obtain $p_{k,\max}^m[n]$.
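
The computation of the two roots and of the resulting instantaneous power cap can be sketched as follows,
with the belief of the SU-to-PU gain represented by a list of samples. This is only an illustrative sketch:
the sample-based representation of the belief, the logarithmic `pu_rate` model, the bisection routine and all
names are assumptions. The expected PU activity is a common positive factor on both sides of the averaged
constraints and is therefore dropped.

```python
import math

def pu_rate(interference, snr):
    """Hypothetical PU rate as a function of the received interference."""
    return math.log2(1.0 + snr / (1.0 + interference))

def power_cap_interference(h1_samples, p_limit):
    """Power at which the expected interference equals p_limit (positive gains assumed)."""
    mean_h1 = sum(h1_samples) / len(h1_samples)
    return p_limit / mean_h1

def power_cap_rate(h1_samples, r_min, snr, p_hi, tol=1e-9):
    """Largest power in [0, p_hi] whose expected PU rate is at least r_min (bisection).

    Assumes the expected rate at zero power is at least r_min."""
    def mean_rate(p):
        return sum(pu_rate(h * p, snr) for h in h1_samples) / len(h1_samples)
    if mean_rate(p_hi) >= r_min:
        return p_hi
    lo, hi = 0.0, p_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_rate(mid) >= r_min:
            lo = mid
        else:
            hi = mid
    return lo

def effective_peak_power(h1_samples, p_limit, r_min, snr, p_peak):
    """Instantaneous cap: minimum of the two roots and the nominal peak power."""
    x = power_cap_interference(h1_samples, p_limit)
    y = power_cap_rate(h1_samples, r_min, snr, p_peak)
    return min(x, y, p_peak)
```

With a single sample the belief is a Dirac delta and the caps reduce to those of the perfect-CSI case, in
line with the limiting behavior discussed in the text.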

With these modifications in mind, it can be shown (see App. A) that the optimal RA with imperfect CSI is

$\bar{\varphi}_k^m(p_{k,2}^m[n]) := \mathbb{E}_{\mathbf{b}(\mathbf{x}|n)}[\varphi_k^m(p_{k,2}^m[n])]$, (20)

$p_{k,2}^{m*}[n] := \arg\max_{p_{k,2}^m[n]} \bar{\varphi}_k^m(p_{k,2}^m[n])$, (21)

$w_{k,2}^{m*}[n] := \mathbb{1}_{\{m = \arg\max_q \bar{\varphi}_k^q(p_{k,2}^{q*}[n])\}}\,\mathbb{1}_{\{p_{k,2}^{m*}[n] > 0 \,\vee\, m = 0\}}$. (22)

In most practical scenarios, the SU-to-SU channels are statistically independent of the SU-to-PU channels.
The same holds true for the activity of the PUs. In such a case, the expectation in (20) factorizes; for
example, its last term can be written as
$\rho_k\,\mathbb{E}_{b_{k,1}(x|n)}[a_{k,1}[n]]\;\mathbb{E}_{b_{k,1}^m(x|n)}[r_{k,1}(h_{k,1}^m[n]\,p_{k,2}^m[n])]$.
This way, we observe that having imperfect CSI does not modify the favorable (separable) structure of the
optimal RA. The main change is that the optimization in (21) has to be carried out taking into account the
CSI imperfections. In most cases, this will entail a higher computational cost (because the expectations
cannot be found in closed form and have to be estimated numerically). If computational burden is a major
problem, robust designs that guarantee a worst-case performance and do not require computing expectations
are a reasonable alternative.

The last step to account for the CSI imperfections is to modify the schemes that compute the multipliers. If
the stochastic schemes in (13)-(15) are used, a simple way to accomplish that task is to replace the
instantaneous updates in the right-hand side of (13)-(15) with their expectations over the instantaneous
belief $\mathbf{b}(\mathbf{x}|n)$. In such a case, the results in Prop. 1 still hold. In fact, if the
expectations over the instantaneous belief were replaced with simple unbiased and bounded estimates, then the
results in Prop. 1 would hold too.

VI. Numerical simulations 

The performance of our schemes is analyzed here via numerical simulations. Since the schemes are 
optimal, the main purpose is to get insights into the optimal policies and the role of each of the DSA 
constraints considered. Two test cases are presented. The first one focuses on the overall sum-capacity 
performance (optimality) and feasibility of the developed schemes. The effects associated with modification 
of the interference levels and DSA constraints are analyzed, and perfect CSI is assumed. The second test case 
analyzes the impact of CSI imperfections. 

To simulate challenging propagation conditions for the SUs, the amplitudes of the secondary links are
Rayleigh distributed (so that $h_{k,2}^m[n]$ follows an exponential distribution), the average SNR for all
users and bands is 3 dB, and the frequency selectivity is assumed to be high, so that gains across bands
(sets of subcarriers) are independent. The model for the SU-to-PU links is Rayleigh too, with average gain
equal to 0 dB. The gain of the PU-to-PU link is 10 dB and every PU is assumed to be active during 80% of the
time. The remaining parameters are set as follows: $M = 5$, $K = 10$, $\beta^m = 1$, $\check{p}^m = 1$,
$\check{p}_{k,1} = 0.15$, and $\check{\epsilon}_{k,1} = 5\%$. The number of time instants simulated is 20000,
the results presented correspond to one single realization of the CSI processes, and time averages are
calculated discarding the first half of the simulated instants.

Test Case 1: optimality and feasibility. To label the schemes in this section, "A" stands for average, "I"
for instantaneous, "P" for power and "C" for capacity. Seven RA schemes are tested: S1) the optimal scheme
that maximizes the performance of the SUs and ignores all DSA constraints (labeled as "None"); S2) the
optimal scheme in this paper considering the long-term interference power constraint (3) and the long-term
rate loss constraint (4) (labeled as "APC"); S3) a scheme like APC, but setting $\check{p}_{k,1} = \infty$,
i.e., ignoring (3) ("AC"); S4) a scheme like APC, but setting $\check{\epsilon}_{k,1} = 1$, i.e., ignoring
(4) and yielding a scheme very similar to the one in [11] ("AP"); S5) a scheme like AP, but replacing (3)
with its instantaneous counterpart in (7) ("IP"); S6) a scheme like AC, but replacing (4) with its
instantaneous counterpart in (8) ("IC"); and S7) a scheme like APC, but replacing both (3) and (4) with their
instantaneous counterparts (7) and (8) ("IPC"). In all cases the CSI is assumed to be error free.

The numerical results corresponding to this test case are plotted in Figs. 1-3. The vertical axes in each of
the figures represent the following: in Fig. 1, the long-term weighted sum-capacity of the SUs (denoted as
$\bar{c}_2$); in Fig. 2, the long-term interference power at the PUs (the value corresponds to the average
across PUs and is denoted as $\bar{p}_1$); and in Fig. 3, the loss on the long-term capacity at the PUs (the
value corresponds to the average across PUs and is denoted as $\bar{\epsilon}_1$). Each of the figures
comprises 4 subplots; the horizontal axis in each of the subplots corresponds to the variation of a different
parameter: $\bar{h}_{k,1}^m$ (subplot a); $\gamma_k$ (subplot b); $\check{p}_{k,1}$ (subplot c); and
$\check{\epsilon}_{k,1}$ (subplot d). The long-term power transmitted by the SUs is not plotted because it is
always 1, which is the value set for $\check{p}^m$.




Fig. 1: Variation of $\bar{c}_2$ w.r.t. $\bar{h}_{k,1}^m$, $\gamma_k$, $\check{p}_{k,1}$, and
$\check{\epsilon}_{k,1}$. Panels: (a) SNR for SU-to-PU; (b) SNR for PU-to-PU; (c) max. power at the PUs;
(d) max. capacity loss at PUs.

The main conclusions are: C1) Our schemes are always able to satisfy the constraints considered in each of
the schemes. C2) Schemes adhering to long-term DSA constraints achieve a better objective (sum-capacity) than
their short-term counterparts. Next we briefly elaborate on them. We begin by analyzing the feasibility
claim. Figs. 2 and 3 confirm that the schemes always satisfy the constraints (small variations around the
nominal value are due to the fact that the plotted values have been computed by averaging over a finite
number of instants). Indeed, we observe that: "None" always violates the constraints; "APC" always satisfies
both of them; "AC" always satisfies the long-term capacity loss constraint (Fig. 3) and "AP" always satisfies
the long-term interference power constraint (Fig. 2); the schemes "IPC", "IP" and "IC" always oversatisfy the
long-term constraints in Figs. 2 and 3. We also observe that "AP" and "AC" always satisfy the active
constraint with equality (corroborating that they try to interfere with the PUs as much as they are allowed
to, so that the sum-rate of the SUs is as high as possible). We also observe that when the constraints are
set to high (loose) values (see Figs. 2(c) and 3(d)), the performance of "AP" and "AC" (the schemes adhering
to long-term constraints) coincides with that of S1 (the scheme that ignores the DSA constraints). This
indeed corroborates that our schemes are optimal. Moving to conclusion C2, the plots reveal that not only
does scheme "APC" always perform better than "IPC", but also that "AP" and "AC" perform better than "IP" and
"IC", respectively. In other words, the schemes adhering to long-term DSA constraints always achieve a higher
objective than their short-term counterparts. Intuitively, the long-term constraints allow SUs to interfere
with the PUs provided that the reward for the secondary network is high enough. This is referred to as
"cognitive diversity" in [30], [29]. The plots also reveal that the performance gap between the short-term
and long-term formulations is larger when the scenario is more demanding.

Fig. 2: Variation of $\bar{p}_1$ w.r.t. $\bar{h}_{k,1}^m$, $\gamma_k$, $\check{p}_{k,1}$, and
$\check{\epsilon}_{k,1}$.

Fig. 3: Variation of $\bar{\epsilon}_1$ w.r.t. $\bar{h}_{k,1}^m$, $\gamma_k$, $\check{p}_{k,1}$, and
$\check{\epsilon}_{k,1}$. Panels: (a) SNR for SU-to-PU; (b) SNR for PU-to-PU; (c) max. power at the PUs;
(d) max. capacity loss at PUs.

Test Case 2: imperfect CSI. In this test case, we incorporate imperfections into the CSI. The objective
is threefold: O1) to numerically assess the performance (sum-capacity) loss due to the presence of CSI
imperfections, O2) to show that our schemes are robust to CSI imperfections and adhere to the DSA constraints
considered, and O3) to show that schemes that do not explicitly account for such imperfections either violate
the DSA constraints or incur a significant loss of performance. Three different experiments are run. Only
the APC and IPC schemes are simulated in this test case. The specific setup and the model for the CSI
imperfections in each of the experiments are described next.

In the first experiment, we consider that the CSI of the secondary network is quantized. The regions are
designed using a scalar quantizer that splits the SNR domain into equally probable regions. The results in
Table I correspond to different quantization levels and demonstrate that for the average (APC) scheme,
quantization of the CSI leads to a small optimality loss w.r.t. the case of perfect CSI. Moreover, the
resulting gap shrinks as the number of regions increases, being negligible when the number of regions is more
than four (two feedback bits). The loss of optimality is more severe for the instantaneous (IPC) scheme. The
reason is that none of the modes is activated during most of the instants in which the PU is active.

TABLE I: Variation of the number of quantization regions: $\check{\epsilon}_{k,1} = 5.0\%$ and
$\check{p}_{k,1} = 0.20$.

                 APC                                IPC
  L      c2       eps1 (%)   p1            c2       eps1 (%)   p1
  1      7.97     4.8        0.14          7.25     2.2        0.06
  2      12.41    5.0        0.15          8.76     2.1        0.06
  4      13.82    5.0        0.16          10.40    2.7        0.07
  8      14.66    5.0        0.15          10.48    2.5        0.07
  inf    15.16    5.0        0.16          14.45    4.0        0.12

(c2, eps1 and p1 stand for $\bar{c}_2$, $\bar{\epsilon}_1$ and $\bar{p}_1$, respectively.)

In the second experiment, we assume that the information about the activity of the PUs is noisy and outdated.
The time evolution of each $a_k[n]$ follows a Gilbert-Elliott model with transition probabilities
$P_{11} = 0.975$, $P_{10} = 0.025$, $P_{00} = 0.9$, and $P_{01} = 0.1$. Two sensing configurations are
simulated. In the first one, $P_{FA} = 0.03$, $P_{MD} = 0.02$, and the activity is measured every $N_a = 5$
slots. In the second one, we set $P_{FA} = 0.1$, $P_{MD} = 0.1$ and $N_a = 10$. We compare the performance of
our schemes (3rd and 7th rows) with that of schemes: i) knowing the actual CSI, ii) ignoring the CSI
imperfections, and iii) relying only on statistical CSI (labels "-i", "-ii" and "-iii" are used in the
table). Clearly, as the sensor accuracy gets worse, the sum-capacity of the SUs gets smaller. The reason is
simple: if the quality of the sensor is high, SUs can take advantage of time instants when the PUs are not
present (in those instants the transmit power of the SU can be as high as they desire). Differently, when the
quality of the sensors is poor, the SUs have to act as if the PUs were always present. This in turn implies
that the loss due to sensing imperfections will be higher in those scenarios where the probability of the PUs
being active is smaller (recall that we have set the probability of a PU being active to 80%). Last but not
least, we observe that our schemes always remain feasible even if the CSI contains imperfections. That is not
the case if the schemes are implemented as if the CSI were perfect (see APC-ii and IPC-ii). Clearly, the
sum-rate for APC-ii is higher than that of our scheme. The reason is that guaranteeing the interference
constraints under a higher level of CSI uncertainty requires more conservative transmission strategies.
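
As a quick consistency check of the two-state activity model used in this experiment, the following Python
sketch computes the long-term probability of a PU being active and shows how the activity belief evolves
between sensing instants. The function names are hypothetical, and the transition probabilities are read as
"probability of moving to the second state given the first" (e.g., `p01` is idle to active), which is an
assumption about the notation.

```python
def stationary_active(p01, p10):
    """Long-term probability of the active state in a two-state Markov chain.

    p01: probability of switching from idle to active; p10: from active to idle."""
    return p01 / (p01 + p10)

def predict_active(prob_active, p01, p10, steps=1):
    """Propagate the belief of the PU being active over a number of unsensed slots."""
    for _ in range(steps):
        prob_active = prob_active * (1.0 - p10) + (1.0 - prob_active) * p01
    return prob_active
```

With the values quoted above, `stationary_active(0.1, 0.025)` evaluates to 0.8, which agrees with the 80%
activity assumed in the simulation setup; `predict_active` illustrates why the belief becomes less
informative as the sensing period grows.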

Finally, the CSI of the SU-to-PU links is assumed to be noisy so that the ratio between the power of the true 
channel and the measurement noise is 4 dB. As in the previous experiment, we simulate APC and IPC schemes 

TABLE II: Imperfections in the detection schemes: $\check{\epsilon}_{k,1}$ = 5.0% and $\check{p}_{k,1}$ = 0.20. Rows 3-6 correspond to 
[$P_{FA}$, $P_{MD}$, $N_a$] = [0.02, 0.03, 5]. Rows 7-10 correspond to [$P_{FA}$, $P_{MD}$, $N_a$] = [0.1, 0.1, 10]. 

                        APC                                  IPC
Version    $C_2$    $\epsilon_1$ (%)   $p_1$     $C_2$    $\epsilon_1$ (%)   $p_1$
Optimal    14.82    5.0                0.15      14.24    3.9                0.12
-i         15.18    5.0                0.15      14.46    4.3                0.13
-ii        15.22    5.5                0.17      14.51    8.7                0.17
-iii       14.39    4.3                0.15      13.57    3.1                0.09
Optimal    14.54    5.0                0.15      13.80    3.3                0.10
-i         15.17    5.0                0.15      14.46    4.3                0.13
-ii        15.30    5.6                0.17      14.68    12.7               0.21
-iii       14.39    5.0                0.15      13.57    3.1                0.09

TABLE III: Imperfections in the CSI of the SU-to-PU links: $\check{\epsilon}_{k,1}$ = 5.0% and $\check{p}_1$ = 0.15. 

                        APC                                  IPC
Version    $C_2$    $\epsilon_1$ (%)   $p_1$     $C_2$    $\epsilon_1$ (%)   $p_1$
Optimal    14.45    5.0                0.15      8.68     3.0                0.08
-i         15.17    5.0                0.15      14.46    4.2                0.12
-ii        14.50    5.8                0.19      7.5      3.0                0.08
-iii       12.50    4.3                0.15      7.89     2.9                0.08

and compare them with -i, -ii and -iii. Our schemes are feasible, and the achieved sum-rate is between the 
one obtained by the scheme that knows the actual CSI (-i) and the one that relies only on statistical CSI (-iii). 
Regarding the schemes ignoring the CSI imperfections, APC-ii achieves a slightly higher sum-rate than the 
scheme accounting for imperfections, but violates the interference constraints. We also observe that APC-ii 
achieves a smaller sum-rate than APC-i, the reason being that the variance of the noisy channel is larger. The 
advantages are clearer in the instantaneous case: IPC-ii not only violates the constraints (this is not apparent 
in the table, which only lists average values), but also yields the worst performance [cf. the formulation in 
(19)]. 
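
The feasibility statements above can be checked mechanically against the averages reported in Table III. The following is a minimal sketch, not part of the original experiments: the dictionary layout and the helper name `violates_average_limits` are illustrative assumptions, and the limits (5.0% and 0.15) are those stated in the table caption. Since the table only lists average values, the check cannot reveal the instantaneous violations of IPC-ii mentioned in the text.

```python
# Averages as reported in Table III:
# (sum-rate, PU capacity loss in %, interference power).
TABLE_III = {
    "APC": (14.45, 5.0, 0.15),
    "APC-i": (15.17, 5.0, 0.15),
    "APC-ii": (14.50, 5.8, 0.19),
    "APC-iii": (12.50, 4.3, 0.15),
    "IPC": (8.68, 3.0, 0.08),
    "IPC-i": (14.46, 4.2, 0.12),
    "IPC-ii": (7.5, 3.0, 0.08),
    "IPC-iii": (7.89, 2.9, 0.08),
}
EPS_MAX = 5.0   # limit on the capacity loss (%), taken from the caption
P_MAX = 0.15    # limit on the interference power, taken from the caption


def violates_average_limits(row, eps_max=EPS_MAX, p_max=P_MAX):
    """Return True if the reported averages exceed either long-term limit."""
    _, eps, p = row
    return eps > eps_max or p > p_max


# Schemes whose reported averages break at least one limit.
infeasible = [name for name, row in TABLE_III.items()
              if violates_average_limits(row)]
print(infeasible)
```

Under these assumptions only APC-ii is flagged, and the sum-rate of the proposed schemes lies between that of the "-iii" and "-i" variants in both the APC and the IPC columns.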

VII. Conclusions 

This paper investigated the design of stochastic algorithms for CR scenarios with multiple primary and 
secondary users operating over time-varying (fading) channels. One of the most critical issues in CRs is how 
SUs coexist with (limit the interference to) PUs. Among the different metrics considered in the paper, the most 
important is the guarantee on the long-term (ergodic) capacity loss experienced by the PUs. Guaranteeing a 
certain rate for PUs is typically challenging because the presence of interference powers renders the optimization 
non-convex. For the operating conditions considered in the paper we showed that two important facts hold. 
The first one is that the optimization problem which gives rise to the resource allocation schemes has zero 
duality gap, so that Lagrangian relaxation can be used without losing optimality. The second one is that 
in the dual domain the non-convex problem can be decoupled (separated) across channels and users. The 
latter implies that the optimization needs to be carried out only over a scalar variable, and thus enables 
implementation of efficient line-search algorithms. It was shown that the optimal resource allocation amounts 
to the maximization of a quality link functional which weighs the quality of the secondary links against the 
damage to the primary users. The terms in the quality link functional depend on the instantaneous CSI (which 
may contain imperfections), and on several Lagrange multipliers (whose values depend on the long-term behavior 
of the system and the requirements of the primary and secondary networks). Simple stochastic algorithms that 
account for the imperfections in the sensing process are used to estimate and predict the actual value of 
the channel. Similarly, stochastic algorithms to estimate the optimum value of the multipliers online were 
also developed. Future work includes consideration of multiple antennas, development of distributed (including 
multi-hop) implementations, and joint design of the sensing and resource allocation schemes. 

Appendix A: On the optimality of the RA 
As pointed out in Sec. III, there are three sources of non-convexity in (9): i) the scheduling coefficients $w_{k,2}^m$ 
are constrained to belong to the non-convex set $\{0, 1\}$; ii) the monomials $w_{k,2}^m p_{k,2}^m$, $w_{k,2}^m r_{k,2}^m$ and $w_{k,2}^m r_{k,1}$ are 
not jointly convex; and iii) constraints (4) are not convex w.r.t. $p_{k,2}^m$. In this appendix, we first discuss how 
the first two sources of non-convexity can be bypassed. Then, we analyze why the reformulated problem has 
zero duality gap. Finally, we show that the RA in (10)-(12) is optimum. 

The way to deal with i) is to relax $w_{k,2}^m \in \{0, 1\}$ and consider $w_{k,2}^m \in [0, 1]$. In general, such a relaxation 
will give rise to solutions that do not satisfy the original constraint $w_{k,2}^m \in \{0, 1\}$. However, it can be 
shown that if $w_{k,2}^m \in \{0, 1\}$ is replaced with $w_{k,2}^m \in [0, 1]$, the solution of (9) satisfies $w_{k,2}^m \in \{0, 1\}$ with 
probability one. This easily follows from the expression for $w_{k,2}^{m*}$ in (12), which was derived considering 
$w_{k,2}^m \in [0, 1]$. Clearly, (12) dictates that $w_{k,2}^{m*}$ is either zero or one. The only problem arises if there are two 
SUs $m_1$ and $m_2$ with positive transmit power satisfying $\varphi_k^{m_1}(p_{k,2}^{m_1*}[n]) = \varphi_k^{m_2}(p_{k,2}^{m_2*}[n]) = \max_m \varphi_k^m(p_{k,2}^{m*}[n])$. 
Since $\varphi_k^m$ and $p_{k,2}^{m*}$ are continuous functions of several (continuous) random variables, the probability of that 
event is zero. For further details on this specific issue, we refer the reader to the end of this appendix, where the 
optimal scheduling is found [cf. (26)]. Nonetheless, it is worth clarifying that from a practical perspective, the 
problems associated with the event of two users achieving the same indicator (which happens if, for example, 
the channel is a discrete random process) can be easily bypassed, for example by using smooth scheduling 
approximations, which are asymptotically optimal; see [16] for details. 

To deal with ii) we follow the same approach used in other RA problems; see, e.g., [16]. The idea is to 
define auxiliary (dummy) variables $\tilde{p}_{k,2}^m := w_{k,2}^m p_{k,2}^m$. The problem in (9) is then reformulated replacing $p_{k,2}^m$ 
with $\tilde{p}_{k,2}^m / w_{k,2}^m$. After straightforward mathematical manipulations, it can be shown that: a) the non-convexity 
caused by the monomials is indeed solved and b) the reformulated problem yields the same Karush-Kuhn-Tucker 
(KKT) conditions as those of the original (9). More specifically, the only difference between the 
solution of (9) considering the original variables and the one considering the dummy variables is the value 
of $p_{k,2}^{m*}$ for users $m$ such that $w_{k,2}^{m*} = 0$. Clearly, such a difference is irrelevant from a performance perspective 
and hence, the optimization can be carried out using any of them. 
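
The effect of the change of variables can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the rate function $\log_2(1 + hp)$, the unit channel gain `h`, the sampling ranges and all function names are hypothetical choices, not taken from the formulation in (9). It evaluates a midpoint test, for which a negative gap contradicts joint concavity, on the product of a scheduling coefficient and a rate, and on its reformulated counterpart written in terms of the dummy power.

```python
import math
import random


def rate_original(w, p, h=1.0):
    """w * log2(1 + h*p): scheduling coefficient times an assumed rate function."""
    return w * math.log2(1.0 + h * p)


def rate_reformulated(w, p_tilde, h=1.0):
    """w * log2(1 + h*p_tilde/w) with p_tilde = w*p; set to 0 when w = 0."""
    if w == 0.0:
        return 0.0
    return w * math.log2(1.0 + h * p_tilde / w)


def midpoint_gap(f, a, b):
    """f(midpoint) minus the average of f at a and b (negative => not concave)."""
    mid = tuple((x + y) / 2.0 for x, y in zip(a, b))
    return f(*mid) - 0.5 * (f(*a) + f(*b))


# One pair of points (w, p) where the original product fails the midpoint test.
gap_original = midpoint_gap(rate_original, (0.2, 1.0), (1.0, 1.2))

# Worst midpoint gap of the reformulated function over random pairs (w, p_tilde).
rng = random.Random(0)
worst_reformulated = min(
    midpoint_gap(
        rate_reformulated,
        (rng.uniform(0.01, 1.0), rng.uniform(0.0, 5.0)),
        (rng.uniform(0.01, 1.0), rng.uniform(0.0, 5.0)),
    )
    for _ in range(10_000)
)
print(gap_original, worst_reformulated)
```

The first gap is negative, whereas the reformulated function passes the test on all sampled pairs up to rounding error, which is what one expects from the perspective of a concave function.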

Regarding the zero duality gap in iii), the basic idea is that the source of non-convexity comes from a 
constraint of the form $\mathbb{E}_{\mathbf{x}}[g(\mathbf{y}, \mathbf{x})]$, where $g(\mathbf{y}, \mathbf{x})$ is a non-convex function w.r.t. $\mathbf{y}$, and $\mathbf{x}$ is a continuous 
random process with infinite support. Here $\mathbf{y}$ is the power; $\mathbf{x}$ is the CSI; and $g(\mathbf{y}, \mathbf{x})$ is the expression for the 
instantaneous capacity, i.e. $\log_2(1 + \gamma_{k,1}/(1 + h_{k,1}^m[n] p_{k,2}^m[n]))$. The proof is omitted due to space limitations, 
but we refer the reader to either [24], or [22, App. A] for further details. 

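To derive the optimum RA in (10)-(12) we start by writing the Lagrangian of (9). To do so, let $\mathbf{z}$ be a vector 
containing all primal variables: $w_{k,2}^m(\mathbf{h})$, $p_{k,2}^m(\mathbf{h})$ $\forall (k, m, \mathbf{h})$. Note that $\mathbf{z}$ has infinite length because $\mathbf{h}$ takes 
infinite values. Moreover, let $\boldsymbol{\lambda}$ be a vector containing all dual variables (multipliers): $\pi^m$, $\theta_k$, $\rho_k$ $\forall (k, m)$. 
The Lagrangian is then 

$$
\begin{aligned}
\mathcal{L}(\mathbf{z}, \boldsymbol{\lambda}) = \mathbb{E}_{\mathbf{h}} \Big[ & \sum_{m,k} \beta^m w_{k,2}^m(\mathbf{h}) \, r_{k,2}^m\big(h_{k,2}^m p_{k,2}^m(\mathbf{h})\big) \\
& - \sum_m \pi^m \Big( \sum_k w_{k,2}^m(\mathbf{h}) p_{k,2}^m(\mathbf{h}) - \check{p}_{m,2} \Big) \\
& - \sum_k \theta_k a_{k,1} \Big( \sum_m w_{k,2}^m(\mathbf{h}) h_{k,1}^m p_{k,2}^m(\mathbf{h}) - \check{p}_{k,1} \Big) \\
& + \sum_k \rho_k a_{k,1} \Big( \sum_m w_{k,2}^m(\mathbf{h}) \, r_{k,1}\big(h_{k,1}^m p_{k,2}^m(\mathbf{h})\big) - \check{r}_{k,1} \Big) \Big].
\end{aligned}
\tag{23}
$$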

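For a given $\boldsymbol{\lambda}$, we need to maximize $\mathcal{L}(\mathbf{z}, \boldsymbol{\lambda})$ w.r.t. $\mathbf{z}$ and guarantee that the solution satisfies the short-term 
constraints in (9b). The structure of $\mathcal{L}(\mathbf{z}, \boldsymbol{\lambda})$ and the constraints allows for a separate optimization w.r.t. $w_{k,2}^m(\mathbf{h})$ 
and $p_{k,2}^m(\mathbf{h})$. First, we find the expression for $p_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda})$, which holds for any value of $w_{k,2}^m(\mathbf{h})$. Then, we 
will use $p_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda})$ to find $w_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda})$. 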

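To handle perfect and imperfect CSI jointly, the expectation in (23) is written as $\mathbb{E}_{\mathbf{h}}[\mathbb{E}_{(\mathbf{x}|\mathbf{h})}[\cdot]]$, so 
that 

$$
\mathcal{L}(\mathbf{z}, \boldsymbol{\lambda}) = \sum_m \pi^m \check{p}_{m,2} + \sum_k \mathbb{E}_{\mathbf{h}}\Big[ \mathbb{E}_{(\mathbf{x}|\mathbf{h})}\big[ a_{k,1} (\theta_k \check{p}_{k,1} - \rho_k \check{r}_{k,1}) \big] \Big] + \mathbb{E}_{\mathbf{h}}\Big[ \sum_{m,k} w_{k,2}^m(\mathbf{h}) \, \varphi_k^m\big(\mathbf{h}, \boldsymbol{\lambda}, p_{k,2}^m(\mathbf{h})\big) \Big].
\tag{24}
$$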



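Clearly, when the CSI is perfect, the inner expectation is not needed and can be dropped. Taking into account 
that the first two terms in (24) do not depend on $\mathbf{z}$, and using the definition of the link quality indicator $\varphi_k^m$ 
in (20), maximizing $\mathcal{L}(\mathbf{z}, \boldsymbol{\lambda})$ w.r.t. $\mathbf{z}$ amounts to maximizing 

$$
\mathcal{L}'(\mathbf{z}, \boldsymbol{\lambda}) := \mathbb{E}_{\mathbf{h}}\Big[ \sum_{m,k} w_{k,2}^m(\mathbf{h}) \, \varphi_k^m\big(\mathbf{h}, \boldsymbol{\lambda}, p_{k,2}^m(\mathbf{h})\big) \Big]
\tag{25}
$$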



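w.r.t. $\mathbf{z}$. Clearly, the unconstrained maximization of $\mathcal{L}'(\mathbf{z}, \boldsymbol{\lambda})$ can be performed separately for each of the 
$(m, k, \mathbf{h})$ terms. However, the optimal solution also needs to satisfy the instantaneous constraints in (9b), 
namely: $\sum_m w_{k,2}^m(\mathbf{h}) = 1$; $0 \le w_{k,2}^m(\mathbf{h}) \le 1$; and $0 \le p_{k,2}^m(\mathbf{h}) \le \check{p}_{k,2}^m(\mathbf{h})$. Indeed, since the instantaneous 
constraints on $p_{k,2}^m(\mathbf{h})$ are decoupled across $m$, $k$ and $\mathbf{h}$, the optimization over the power can be performed 
separately for each of the $(m, k, \mathbf{h})$ terms. To find $p_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda})$ we consider two different cases: i) if $w_{k,2}^m(\mathbf{h}) > 0$, 
then the optimum $p_{k,2}^m(\mathbf{h})$ is found by maximizing $\varphi_k^m(\mathbf{h}, \boldsymbol{\lambda}, p_{k,2}^m(\mathbf{h}))$ and projecting the solution onto 
the feasible interval $[0, \check{p}_{k,2}^m(\mathbf{h})]$; ii) if $w_{k,2}^m(\mathbf{h}) = 0$, then any value of $p_{k,2}^m(\mathbf{h})$ is equally optimum, 
including the one which is optimum for i). As a result, we can conclude that finding $p_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda})$ by maximizing 
$\varphi_k^m(\mathbf{h}, \boldsymbol{\lambda}, p_{k,2}^m(\mathbf{h}))$ and projecting onto $[0, \check{p}_{k,2}^m(\mathbf{h})]$ is optimum for any value of $w_{k,2}^m(\mathbf{h})$. This is indeed the 
result in (11) and (21) for the cases of perfect and imperfect CSI, respectively. 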
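
Because the power optimization reduces to a scalar search per $(m, k, \mathbf{h})$ term, it can be sketched in a few lines. The example below is a hedged illustration only: the functional form of `link_quality` is an assumed stand-in for the indicator in (20), which is not reproduced here, and all parameter values are hypothetical. In place of the maximize-and-project step described above, the sketch uses a plain grid-based line search restricted to the feasible interval, which does not require the indicator to be concave in the power.

```python
import math


def link_quality(p, h2, h1, gamma1, beta, pi_m, theta_k, rho_k):
    """Illustrative link quality indicator (an assumed form, not the paper's (20)).

    p: SU transmit power; h2, h1: SU-to-SU and SU-to-PU channel gains;
    gamma1: PU SNR without interference; beta: SU weight;
    pi_m, theta_k, rho_k: multipliers pricing SU power, PU interference,
    and PU capacity loss, respectively.
    """
    su_rate = math.log2(1.0 + h2 * p)
    pu_rate = math.log2(1.0 + gamma1 / (1.0 + h1 * p))
    return beta * su_rate - pi_m * p - theta_k * h1 * p + rho_k * pu_rate


def line_search_power(phi, p_max, num_points=2001):
    """Grid-based line search for the maximizer of phi over [0, p_max]."""
    best_p, best_val = 0.0, phi(0.0)
    for i in range(1, num_points):
        p = p_max * i / (num_points - 1)
        val = phi(p)
        if val > best_val:
            best_p, best_val = p, val
    return best_p, best_val


# Hypothetical channel realization and multiplier values.
params = dict(h2=2.0, h1=0.5, gamma1=10.0, beta=1.0,
              pi_m=0.3, theta_k=0.2, rho_k=0.5)
p_star, phi_star = line_search_power(lambda p: link_quality(p, **params),
                                     p_max=5.0)
print(p_star, phi_star)
```

A finer grid, or a golden-section search when the indicator is known to be unimodal, could replace the uniform grid; the choice here favors simplicity over speed.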

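Once $p_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda})$ are known $\forall (k, m)$, we are ready to find $w_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda})$. To carry out this task, we substitute 
$p_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda})$ into (24) and rely on the fact that the short-term scheduling constraints are decoupled across 
channels. As a result, for each $\mathbf{h}$, it suffices to solve $K$ instances (one per $k$) of 

$$
\max_{\{w_{k,2}^m(\mathbf{h})\}_{m=0}^{M}} \; \sum_{m=0}^{M} w_{k,2}^m(\mathbf{h}) \, \varphi_k^m\big(\mathbf{h}, \boldsymbol{\lambda}, p_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda})\big)
\tag{26a}
$$
$$
\text{s. to:} \quad \sum_{m=0}^{M} w_{k,2}^m(\mathbf{h}) = 1
\tag{26b}
$$
$$
0 \le w_{k,2}^m(\mathbf{h}) \le 1 \quad \forall m
\tag{26c}
$$

whose solution yields $\{w_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda})\}_{m=0}^{M}$. Since (26) is linear in $w_{k,2}^m(\mathbf{h})$, the solution is straightforward and 
consists of setting $w_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda}) = 1$ for the user $m$ which maximizes $\varphi_k^m(\mathbf{h}, \boldsymbol{\lambda}, p_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda}))$, while setting 
$w_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda}) = 0$ for all other users. If the winner user is unique, this policy can be written in closed form using 
the indicator function as $w_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda}) = \mathbb{1}_{\{m = \arg\max_l \varphi_k^l(p_{k,2}^{l*}(\mathbf{h}, \boldsymbol{\lambda}))\}}$. If more than one user attains the maximum 
(this event will be referred to as a tie), choosing any of them is optimum from the point of view of (26). 
However, since $\varphi_k^m(\mathbf{h}, \boldsymbol{\lambda}, p_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda}))$ is a continuous non-negative random variable, ties in practice only occur 
if $p_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda}) = 0$ for all $m$. In such a case, the LQI is the same for all $M+1$ users and any of them could be 
selected. In this situation, we assign the access to the virtual user $m = 0$, i.e. we set $w_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda}) = 0$ for all $m > 0$. 
Combining these two conditions we can write $w_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda}) = \mathbb{1}_{\{m = \arg\max_l \varphi_k^l(p_{k,2}^{l*}(\mathbf{h}, \boldsymbol{\lambda}))\}} \, \mathbb{1}_{\{p_{k,2}^{m*}(\mathbf{h}, \boldsymbol{\lambda}) > 0 \,\vee\, m = 0\}}$ 
for all $m \ge 0$. This is precisely the solution in (12) and (22), for the cases of perfect and imperfect CSI, 
respectively. 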
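
The winner-takes-all rule, including the tie-breaking convention in favor of the virtual user, can be sketched as follows. This is a minimal illustration under assumptions: the function name `schedule`, the list-based interface, and the convention that index 0 holds the virtual user (with zero power) are hypothetical choices made for the example.

```python
def schedule(lqi, power):
    """Winner-takes-all scheduling for a single channel.

    lqi[m] and power[m] are the link quality indicator and the optimal power
    of user m; index 0 is assumed to be the virtual user (no transmission).
    Returns the list of scheduling coefficients, exactly one of which is 1.
    """
    num_users = len(lqi)
    # max() returns the first index attaining the maximum, so ties are
    # resolved in favor of the lowest index (the virtual user if it is tied).
    winner = max(range(num_users), key=lambda m: lqi[m])
    # A real user that would transmit with zero power yields to the virtual user.
    if winner != 0 and power[winner] <= 0.0:
        winner = 0
    return [1 if m == winner else 0 for m in range(num_users)]


# Hypothetical values: user 1 has the largest indicator and positive power.
print(schedule([0.0, 1.2, 0.7], [0.0, 2.0, 1.0]))
# All powers are zero, so the access goes to the virtual user m = 0.
print(schedule([0.0, 0.0, 0.0], [0.0, 0.0, 0.0]))
```

The coefficients always add up to one, so constraint (26b) holds by construction.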